
Python for Derivatives Analytics in a Nutshell

Yves J. Hilpisch

October 16, 2010

Abstract. Financial Engineering in general and Derivatives Analytics in particular are disciplines where theory goes hand in hand with numerical methods, which in turn have to be implemented in some form of computer code. Python (www.python.org) lends itself very well to this task. This brief note introduces some syntax and paradigms of Python by mainly relying on the example of European option pricing.

Contents
1 Python Fundamentals
  1.1 Installing Python(x,y)
  1.2 First Steps with Python
  1.3 Array Operations
  1.4 Random Numbers
  1.5 Plotting

2 European Option Pricing
  2.1 Black-Scholes Approach
  2.2 Binomial Approach
  2.3 Monte Carlo Approach

3 Selected Financial Topics
  3.1 Approximation
  3.2 Optimization
  3.3 Numerical Integration

4 Advanced Python Topics
  4.1 Classes and Objects
  4.2 Data Import and Export

References

Dr. Yves J. Hilpisch, Managing Director, Visixion GmbH, Rathausstrasse 75-79, 66333 Voelklingen, Germany. Email contact@visixion.com, Internet www.visixion.com, www.dexision.com. Preliminary work, comments are welcome.

1 Python Fundamentals
1.1 Installing Python(x,y)
Python(x,y) is a distribution of a Python system and a number of useful libraries and tools for scientific purposes. The Web site www.pythonxy.com provides current downloads and further information. The only drawback is that it is a Windows-only version. If you use another system, make sure to install the following:

Python 2.6.x (www.python.org)
NumPy 1.3.x (numpy.scipy.org)
SciPy 0.7.x (www.scipy.org)
Matplotlib 0.99 (matplotlib.sourceforge.net)
xlrd, xlwt, xlutils (www.python-excel.org)

Installing either Python(x,y) or the single components is generally pretty easy and fast. You might wonder why I do not recommend the most recent version of Python, which is already 3.x. There are two reasons. Python 2.6.x is also current and maintained by the Open Source community. Some syntax has changed in 3.x such that the versions are not fully compatible, and most code and documentation available is for Python 2.x. By reading this note, you will NOT learn how to code in general or learn Python from scratch and to black belt level. However, for someone coming with C++ experience, for example, the note illustrates fundamental aspects of Python that are useful for Derivatives Analytics and Financial Engineering. For someone who starts out in these areas, the topics covered provide a first glimpse at coding in general and for Derivatives Analytics in particular. For this group, the note may act as a starting point for digging deeper into areas of further interest. For a thorough introduction to Python from a scientific point of view, you should consult Langtangen (2009).

1.2 First Steps with Python


After starting IDLE, the standard integrated development environment which comes with any Python distribution, you should see something like this on the screen:
Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
****************************************************************
Personal firewall software may warn about the connection IDLE
makes to its subprocess using this computer's internal loopback
interface. This connection is not visible on any external
interface and no data is sent to or received from the Internet.
****************************************************************
IDLE 2.6.5
>>>

Before we go on and write our first module, as it is called in Python, let's use this command line interpreter for some math.

>>> 3+4
7
>>> 3/4
0
>>>

Addition seems to work right, but division apparently not. This is due to Python interpreting 3 and 4 as integers such that division gives 0 instead of 0.75. Putting a dot behind either 3 or 4 or both does the trick (i.e. we tell Python that we are working with floats).
>>> 3.0/4
0.75
>>>

So types are important with Python. One has to be careful since Python is a dynamically typed language, which means that types are inferred from the context rather than declared explicitly. In C++, for example, you would have to assign a certain type to a variable before using it. Variables are defined in Python with the = sign:
>>> a=3
>>> b=4
>>> a/b
0
>>> a=3.0
>>> a/b
0.75
>>>

Even though Python has lots of functionality built in already, most of it is stored in modules which have to be imported. An example is the math module which contains, among others, trigonometric functions.
>>> sin(a)
Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    sin(a)
NameError: name 'sin' is not defined
>>> from math import *
>>> sin(a)
0.14112000805986721
>>>

If you want to indicate that the sin function is from the math module, you can also import the module itself and not the functions that are contained therein.
>>> import math
>>> math.sin(b)
-0.7568024953079282
>>>

You can easily define functions yourself.

>>> def f(x):
	return x**3+x**2-2

>>> f(2)
10
>>> f(a)
34.0
>>>

Here, x**3 stands for x to the power of 3. Generally, if you are doing something useful which you would like to keep for later use, you would not work with the command line interpreter. You would rather write a module, store the function in it and save it to disk. You can do this any time by clicking on New Window in the File menu of IDLE. There you could store the previous code like this:
#
# First Program with Python
# a_First_Program.py
#
from math import *

# Variable Definition
a = 3.0
b = 4

# Function Definition
def f(x):
    return x ** 3 + x ** 2 - 2

# Calculation
f_a = f(a)
f_b = f(b)

# Output
print "f(a) = ", f_a
print "f(b) = ", f_b

The # sign allows you to include comments in your code that are ignored by the Python interpreter. Make sure when saving Python modules to always include the suffix .py. Now you can use F5 to start the program, which produces the following output:
>>> ================================ RESTART ================================
>>>
f(a) =  34.0
f(b) =  78
>>>

This should be enough for the first steps with Python. We calculated, defined a function and wrote a first module containing that function.

1.3 Array Operations


NumPy is a powerful library that allows array manipulations (linear algebra) in a compact form and at high speed. The speed comes from the implementation in C. So you have the convenience of Python combined with the speed of C when doing array operations.
>>> from numpy import *
>>> a=arange(0.0,20.0,1.0)
>>> a
array([  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,
        11.,  12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.])
>>> a.resize(4,5)
>>> a
array([[  0.,   1.,   2.,   3.,   4.],
       [  5.,   6.,   7.,   8.,   9.],
       [ 10.,  11.,  12.,  13.,  14.],
       [ 15.,  16.,  17.,  18.,  19.]])
>>> a[0]
array([ 0.,  1.,  2.,  3.,  4.])
>>> a[3]
array([ 15.,  16.,  17.,  18.,  19.])
>>> a[1,4]
9.0
>>> a[1,2:4]
array([ 7.,  8.])
>>>

The first examples of array definition and manipulation should be self-explanatory. Care has to be taken with the conventions regarding array indices. The best way to learn these is to play with arrays. With NumPy, array operations are as easy as operations on integers or floats.
>>> a*0.5
array([[ 0. ,  0.5,  1. ,  1.5,  2. ],
       [ 2.5,  3. ,  3.5,  4. ,  4.5],
       [ 5. ,  5.5,  6. ,  6.5,  7. ],
       [ 7.5,  8. ,  8.5,  9. ,  9.5]])
>>> a**2
array([[   0.,    1.,    4.,    9.,   16.],
       [  25.,   36.,   49.,   64.,   81.],
       [ 100.,  121.,  144.,  169.,  196.],
       [ 225.,  256.,  289.,  324.,  361.]])
>>> a+a
array([[  0.,   2.,   4.,   6.,   8.],
       [ 10.,  12.,  14.,  16.,  18.],
       [ 20.,  22.,  24.,  26.,  28.],
       [ 30.,  32.,  34.,  36.,  38.]])
>>>

We can also use our previously defined function f.


>>> f(a)
array([[ -2.00000000e+00,   0.00000000e+00,   1.00000000e+01,
          3.40000000e+01,   7.80000000e+01],
       [  1.48000000e+02,   2.50000000e+02,   3.90000000e+02,
          5.74000000e+02,   8.08000000e+02],
       [  1.09800000e+03,   1.45000000e+03,   1.87000000e+03,
          2.36400000e+03,   2.93800000e+03],
       [  3.59800000e+03,   4.35000000e+03,   5.20000000e+03,
          6.15400000e+03,   7.21800000e+03]])
>>>

Here, the syntax e+03 stands for 10^3. Sometimes you need to loop over arrays to check something. Looping is also quite intuitive in Python.
>>> b=arange(0.0,100.0,1.0)
>>> for i in range(100):
	if b[i]==50.0:
		print "50.0 at index no.", i

50.0 at index no. 50
>>>

The use of arange and range should be obvious. The first can produce arrays of float type while the latter only generates integers; and since indices of arrays are always integers, we loop over integers and not over floats or something else.
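As an aside, such checks can often be done without an explicit Python loop. The following is a minimal sketch (not part of the original example) that uses NumPy's where function to find the same index:

#
# Vectorized Index Lookup with NumPy (sketch)
#
from numpy import arange, where

b = arange(0.0, 100.0, 1.0)
idx = where(b == 50.0)[0]      # array of all indices where the condition holds
print "50.0 at index no.", idx[0]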

1.4 Random Numbers


Derivatives Analytics cannot live without random numbers, be they pseudo-random or quasi-random. NumPy has convenient functions for the generation of pseudo-random numbers built into the sub-module random.
>>> from numpy.random import *
>>> b=standard_normal((4,5))
>>> b
array([[-0.59317286,  0.27533818, -0.46122351, -0.05138033, -1.8371135 ],
       [-1.15520074,  1.04980946,  0.31082909,  0.32662006, -0.36752163],
       [ 0.66452767, -0.88077193,  1.18253972,  0.16836824, -1.40541028],
       [ 0.01481426, -0.88137549,  0.74594197, -0.97360666, -0.77270426]])
>>> a+b
array([[ -0.59317286,   1.27533818,   1.53877649,   2.94861967,   2.1628865 ],
       [  3.84479926,   7.04980946,   7.31082909,   8.32662006,   8.63247837],
       [ 10.66452767,  10.11922807,  13.18253972,  13.16836824,  12.59458972],
       [ 15.01481426,  15.11862451,  17.74594197,  17.02639334,  18.22729574]])
>>>

1.5 Plotting
More often than not, one wants to visualize results from calculations or simulations. The module matplotlib is quite powerful when it comes to 2D visualizations of any kind. The most important types of graphics for Derivatives Analytics are lines, dots and bars.
>>> from matplotlib.pyplot import *
>>> plot(a+b)
[<matplotlib.lines.Line2D object at 0x020A7E30>,
 <matplotlib.lines.Line2D object at 0x0218E690>,
 <matplotlib.lines.Line2D object at 0x0218E730>,
 <matplotlib.lines.Line2D object at 0x0218E7B0>,
 <matplotlib.lines.Line2D object at 0x0218E830>]
>>> xlabel('x-axis')
<matplotlib.text.Text object at 0x020BC590>
>>> ylabel('y-axis')
<matplotlib.text.Text object at 0x020BCEF0>
>>> grid(True)
>>> show()

Figure 1 shows the output. Notice that matplotlib produces five lines with four different values each, which is due to the array size of 4 x 5. The next example combines a dot sub-plot with a bar sub-plot, the result of which is shown in figure 2. Here, due to resizing of the array, we have only a one-dimensional set of numbers.
>>> c=resize(b,20)
>>> subplot(211)
<matplotlib.axes.AxesSubplot object at 0x0CD547F0>
>>> plot(c,'ro')
[<matplotlib.lines.Line2D object at 0x01A44C90>]
>>> subplot(212)
<matplotlib.axes.AxesSubplot object at 0x0CF93150>
>>> bar(range(20),c)
[<matplotlib.patches.Rectangle object at 0x0D016230>, ...]
>>> show()

This is about all we need to implement different European option pricing algorithms in the next section. What may be missing will be added on the fly.

Figure 1: Example of a figure with matplotlib (here: lines)

Figure 2: Example of a figure with matplotlib (here: dots & bars)

2 European Option Pricing


2.1 Black-Scholes Approach
The seminal model of Black and Scholes (1973) is still a benchmark for the pricing of European options on stocks and indices. The analytical call option formula without dividends is

$$C_0(K, T) = S_0 N(d_1) - e^{-rT} K N(d_2)$$

$$d_1 = \frac{\log(S_0/K) + (r + 0.5 \sigma^2) T}{\sigma \sqrt{T}}, \qquad d_2 = d_1 - \sigma \sqrt{T}$$

where N is the cumulative distribution function (cdf) of a standard normal random variable. The single variables have the following meaning, respectively:

C_0: call option value today
S_0: index level today
K: strike price of the option
T: time-to-maturity of the call option
r: risk-less short rate
σ: volatility of the index level (standard deviation of its returns)

All we need in addition to implement the formula is the cdf of a standard normal variable. We get this from the scipy module, which contains a sub-module called stats.
#
# Valuation of European Call Option in BS73 Model
# b_BS1973.py
#
from scipy import stats
from math import *

# Option Parameters
s0 = 105.00    # Initial Index Level
K = 100.00     # Strike Level
T = 1.0        # Call Option Maturity
r = 0.05       # Constant Short Rate
vola = 0.25    # Constant Volatility of Diffusion

# Analytical Formula
def BS73_Call_Value(s0, K, T, r, vola):
    d1 = (log(s0 / K) + (r + 0.5 * vola ** 2) * T) / (vola * sqrt(T))
    d2 = d1 - vola * sqrt(T)
    BS_C = (s0 * stats.norm.cdf(d1, 0.0, 1.0)
            - K * exp(-r * T) * stats.norm.cdf(d2, 0.0, 1.0))
    return BS_C

# Output
print "Value of European call option is", BS73_Call_Value(s0, K, T, r, vola)

The function BS73_Call_Value gives us a benchmark value for the European call option with the parameters as defined in the Python module:
>>> ================================ RESTART ================================
>>>
Value of European call option is 15.6547197268
>>>

2.2 Binomial Approach


To better understand how to implement the binomial option pricing model of Cox, Ross and Rubinstein (1979), a little more background seems helpful. There are two securities traded in the model: a risky stock index and a risk-less zero-coupon bond. The time horizon [0, T] is divided into equidistant time intervals Δt so that one gets T/Δt + 1 points in time t ∈ {0, Δt, 2Δt, ..., T}. The zero-coupon bond grows p.a. in value with the risk-less short rate r, B_t = B_0 e^{rt} where B_0 > 0. Starting from a strictly positive, fixed stock index level of S_0 at t = 0, the stock index evolves according to the law

$$S_{t+\Delta t} = S_t \cdot m$$

where m is selected randomly from {u, d}. Here, 0 < d < e^{rΔt} < u ≡ e^{σ√Δt} as well as u ≡ 1/d as a simplification, which leads to a recombining tree. Assuming that risk-neutral valuation holds, the following relationship can be derived:

$$S_t = e^{-r\Delta t}\, E_t^Q[S_{t+\Delta t}] = e^{-r\Delta t}\,(q u S_t + (1-q) d S_t)$$

Against this background, the risk-neutral (or martingale) probability is

$$q = \frac{e^{r\Delta t} - d}{u - d}$$

The value of a European call option C_0 is then obtained by discounting the final payoffs C_T(S_T, K) ≡ max[S_T - K, 0] at t = T back to t = 0:

$$C_0 = e^{-rT}\, E_0^Q[C_T]$$

The discounting can be done step-by-step and node-by-node backwards, starting at t = T - Δt. From an algorithmic point of view, one has to first generate the index level values, then determine the final payoffs of the call option, and finally discount them back. This is what we will do now, starting with a somewhat naive implementation. But before we do, we generate a Python module which contains all parameters that we will need for the different implementations afterwards. All parameters can be imported by using the import command and the respective filename without the suffix .py (i.e. the filename is c_Parameters.py and the module name is c_Parameters).
#
# Model Parameters for European Call Option and Binomial Models
# c_Parameters.py
#
from math import exp, sqrt

# Option Parameters
s0 = 105.00    # Initial Index Level
K = 100.00     # Strike Level
T = 1.0        # Call Option Maturity
r = 0.05       # Constant Short Rate
vola = 0.25    # Constant Volatility of Diffusion

# Time Parameters
t = 3                     # Time Intervals
delta = T / t             # Length of Time Interval
df = exp(-r * delta)      # Discount Factor

# Binomial Parameters
u = exp(vola * sqrt(delta))           # Up-Movement
d = 1 / u                             # Down-Movement
q = (exp(r * delta) - d) / (u - d)    # Martingale Probability

Here is now the first version of the binomial model, which uses Excel-like cell iterations extensively. We will see that there are ways to a more compact and faster implementation.
#
# Valuation of European Call Option in CRR1979 Model
# Naive Version (= Excel-like Iterations)
# d_CRR1979_Naive.py
#
from numpy import *
from c_Parameters import *

# Array Initialization for Index Levels
s = zeros((t + 1, t + 1), 'float')
s[0, 0] = s0
z = 0
for j in range(1, t + 1, 1):
    z = z + 1
    for i in range(z + 1):
        s[i, j] = s[0, 0] * (u ** j) * (d ** (i * 2))

# Array Initialization for Inner Values
iv = zeros((t + 1, t + 1), 'float')
z = 0
for j in range(0, t + 1, 1):
    for i in range(z + 1):
        iv[i, j] = round(max(s[i, j] - K, 0), 8)
    z = z + 1

# Valuation
pv = zeros((t + 1, t + 1), 'float')   # Present Value Array
pv[:, t] = iv[:, t]                   # Last Time Step Initial Values
z = t + 1
for j in range(t - 1, -1, -1):
    z = z - 1
    for i in range(z):
        pv[i, j] = (q * pv[i, j + 1] + (1 - q) * pv[i + 1, j + 1]) * df

# Output
print "Value of European call option is", pv[0, 0]

The command zeros((i,j),'float') initializes a NumPy array with dimension i x j where each number is of type float. A run of the module gives the following output and arrays, where one can easily follow the three steps (index levels, inner values, discounting):
>>> Value of European call option is 16.2929324488
>>> s
array([[ 105.        ,  121.30377267,  140.13909775,  161.89905958],
       [   0.        ,   90.88752771,  105.        ,  121.30377267],
       [   0.        ,    0.        ,   78.67183517,   90.88752771],
       [   0.        ,    0.        ,    0.        ,   68.09798666]])
>>> iv
array([[  5.        ,  21.30377267,  40.13909775,  61.89905958],
       [  0.        ,   0.        ,   5.        ,  21.30377267],
       [  0.        ,   0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ,   0.        ]])
>>> pv
array([[ 16.29293245,  26.59599847,  41.79195237,  61.89905958],
       [  0.        ,   5.61452766,  10.93666406,  21.30377267],
       [  0.        ,   0.        ,   0.        ,   0.        ],
       [  0.        ,   0.        ,   0.        ,   0.        ]])
>>>

Our alternative version makes more use of the capabilities of NumPy; the consequence is more compact code, even if it is not so easy to read at first.


#
# Valuation of European Call Option in CRR1979 Model
# Advanced Version (= NumPy Iterations)
# d_CRR1979_Naive.py
#
from numpy import *
from c_Parameters import *

# Array Initialization for Index Levels
mu = arange(t + 1)
mu = resize(mu, (t + 1, t + 1))
md = transpose(mu)
mu = u ** (mu - md)
md = d ** md
s = s0 * mu * md

# Valuation
pv = maximum(s - K, 0)
Qu = zeros((t + 1, t + 1), 'float')
Qu[:, :] = q
Qd = 1 - Qu
z = 0
for i in range(t - 1, -1, -1):
    pv[0:t - z, i] = (Qu[0:t - z, i] * pv[0:t - z, i + 1]
                      + Qd[0:t - z, i] * pv[1:t - z + 1, i + 1]) * df
    z = z + 1

# Output
print "Value of European call option is", pv[0, 0]

The valuation result is, as expected, the same for the parameter definitions from before. However, three time intervals are of course not enough to come close to the Black-Scholes benchmark of 15.6547197268. With 1,000 time intervals, however, the algorithms come quite close to it:
>>> ================================ RESTART ================================
>>>
Value of European call option is 15.6537846075
>>>

The major difference between the two algorithms is execution time. The second implementation, which avoids Python iterations as much as possible, is about 30 times faster than the first one. You should make this a principle for your own coding efforts: whenever possible, avoid iterations in Python and delegate them to NumPy. Apart from time savings, you generally also get more compact and readable code. A direct comparison illustrates this point:
# Naive Version --- Iterations in Python
#
# Array Initialization for Inner Values
iv = zeros((t + 1, t + 1), 'float')
z = 0
for j in range(0, t + 1, 1):
    for i in range(z + 1):
        iv[i, j] = max(s[i, j] - K, 0)
    z = z + 1

# Advanced Version --- Iterations with NumPy/C
#
pv = maximum(s - K, 0)
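If you want to verify such speed claims on your own machine, a rough timing comparison can be done with Python's time module. The following is only a sketch under the assumption that the parameters come from c_Parameters.py; with only three time intervals the absolute numbers are tiny, so the factor becomes meaningful only for a larger number of intervals (e.g. t = 1000):

#
# Rough Timing Comparison of the Two Variants (sketch)
#
from time import time
from numpy import arange, resize, transpose, zeros, maximum
from c_Parameters import *

# index levels as in the advanced version
mu = arange(t + 1)
mu = resize(mu, (t + 1, t + 1))
md = transpose(mu)
s = s0 * u ** (mu - md) * d ** md

def inner_values_loop(s, K, t):
    # inner values via explicit Python loops (naive version)
    iv = zeros((t + 1, t + 1), 'float')
    z = 0
    for j in range(0, t + 1, 1):
        for i in range(z + 1):
            iv[i, j] = max(s[i, j] - K, 0)
        z = z + 1
    return iv

t0 = time(); inner_values_loop(s, K, t); t1 = time()
t2 = time(); maximum(s - K, 0); t3 = time()
print "Python loops:", t1 - t0, "seconds"
print "NumPy:       ", t3 - t2, "seconds"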

To conclude this section, I want to apply the Fast Fourier Transform (FFT) algorithm to the binomial model. Nowadays this numerical routine plays a central role in Derivatives Analytics. It is used regularly for plain vanilla option pricing in productive environments in investment banks or hedge funds. In general, however, it is not applied to a binomial model, but the application in this case is straightforward and therefore a quick win for us (refer to Černý (2004) for details of this method and its application to a binomial model).
#
# Valuation of European Call Option in CRR1979 Model
# FFT Version (no Python iterations at all)
# f_CRR1979_FFT.py
#
from numpy import *
from numpy.fft import fft, ifft
from c_Parameters import *

# Array Generation for Index Levels
md = arange(t + 1)
mu = resize(md[-1], t + 1)
mu = u ** (mu - md)
md = d ** md
s = s0 * mu * md

# Valuation by FFT
C_T = maximum(s - K, 0)
Q = zeros(t + 1, 'float')
Q[0] = q
Q[1] = 1 - q
l = sqrt(t + 1)
v1 = ifft(C_T) * l
v2 = (sqrt(t + 1) * fft(Q) / (l * (1 + r * delta))) ** t
C_0 = fft(v1 * v2) / l

# Output
print "Value of European call option is", real(C_0[0])

In this module, Python iterations are avoided altogether; this is possible since for European options only the final payoffs are relevant. The speed advantage of this algorithm is again considerable: it is 30 times faster than our advanced algorithm from before and 900 times faster than the naive version.


2.3 Monte Carlo Approach


Finally, we apply Monte Carlo simulation (MCS) to value the same European call option. Here is where pseudo-random numbers come into play (Glasserman (2004) is a comprehensive reference on the Monte Carlo method). Similarly to the FFT algorithm, we only care about the final index level at T and simulate it with pseudo-random numbers. We get the following simple simulation algorithm: consider the date of maturity T and write

$$S_T = S_0 \exp\left((r - \tfrac{1}{2}\sigma^2) T + \sigma \sqrt{T}\, w_T\right) \qquad (1)$$

- start iterating i = 1, 2, ..., I
  - draw a standard normally distributed pseudo-random number w_T(i)
  - determine at T the index level S_T(i) by applying the pseudo-random number to equation (1)
  - determine the inner value of the call at T as max[S_T(i) - K, 0]
  - iterate until i = I
- sum up all inner values at T, take the average and discount back to t = 0:

$$C_0(K, T) \approx e^{-rT} \frac{1}{I} \sum_{i=1}^{I} \max[S_T(i) - K, 0]$$

- this is the MCS estimator for the European call option value

Although the word iterating sounds like looping over arrays, we can again avoid array loops completely on the Python level. The Python/NumPy implementation is really compact: only 5 lines of code for the core algorithm. With another 5 lines we can produce a histogram of the index levels at T as displayed in figure 3.
#
# Valuation of European Call Option via Monte Carlo Simulation
# g_MCS.py
#
from numpy.random import *
from matplotlib.pyplot import *
from c_Parameters import *
from numpy import *

# Valuation via MCS
paths = 300000
rand = standard_normal(paths)
sT = s0 * exp((r - 0.5 * vola ** 2) * T + sqrt(T) * vola * rand)
pv = sum(maximum(sT - K, 0) * exp(-r * T)) / paths
print "Value of European call option is", pv

# Graphical Analysis
figure()
hist(sT, 100)
xlabel('stock value at T')
ylabel('frequency')
show()


Figure 3: Histogram of simulated stock index levels at T

The algorithm produces a quite accurate estimate for the European call option value, although the implementation is rather simplistic (i.e. there are, for example, no variance reduction techniques involved); a sketch of a simple variance reduction follows the output below:
>>> ================================ RESTART ================================
>>>
Value of European call option is 15.6306695905
>>>
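One of the simplest variance reduction techniques, antithetic variates, needs only a minor change to the five core lines: every draw w is reused with opposite sign as -w. The following is a minimal sketch, not part of the original module, assuming the parameters from c_Parameters.py:

#
# MCS with Antithetic Variates (sketch, parameters from c_Parameters.py assumed)
#
from numpy import *
from numpy.random import standard_normal
from c_Parameters import *

paths = 150000                        # half the draws, each one used twice
rand = standard_normal(paths)
rand = concatenate((rand, -rand))     # antithetic pairs (w, -w)
sT = s0 * exp((r - 0.5 * vola ** 2) * T + sqrt(T) * vola * rand)
pv = sum(maximum(sT - K, 0) * exp(-r * T)) / (2 * paths)
print "Value of European call option is", pv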

3 Selected Financial Topics


3.1 Approximation
It is often the case in Derivatives Analytics that one has to approximate something to draw conclusions. Two important approximation techniques are regression and interpolation (Brandimarte (2006, sec. 3.3) introduces these techniques). The type of regression we consider is called ordinary least squares regression (OLS). In its most simple form, ordinary polynomials x, x^2, x^3, ... are used to approximate a desired function y = f(x) given a number N of observations (y_1, x_1), (y_2, x_2), ..., (y_N, x_N). Say we want to approximate f(x) with a polynomial of order 2, g(x) = a_1 + a_2 x + a_3 x^2, where the a_i are regression parameters. The task is then to

$$\min_{a_1, a_2, a_3} \sum_{n=1}^{N} \left(y_n - g(x_n; a_1, a_2, a_3)\right)^2$$

As an example, we want to approximate the cosine function over the interval [0, π/2] given 20 observations. The code is straightforward since NumPy has the built-in functions polyfit and polyval. From polyfit you get back the minimizing regression parameters, while you can use them with polyval to generate values based on these parameters. The result is shown in figure 4 for three different regression functions.
#
# Ordinary Least Squares Regression
# h_REG.py
#
from numpy import *
from matplotlib.pyplot import *

# Regression
x = linspace(0.0, pi / 2, 20)
y = cos(x)
g1 = polyfit(x, y, 0)
g2 = polyfit(x, y, 1)
g3 = polyfit(x, y, 2)
g1y = polyval(g1, x)
g2y = polyval(g2, x)
g3y = polyval(g3, x)

# Graphical Analysis
plot(x, y, 'y')
plot(x, g1y, 'rx')
plot(x, g2y, 'bo')
plot(x, g3y, 'g>')

The concept of interpolation is much more involved but nevertheless almost as straightforward in applications. The most common type of interpolation is with cubic splines, for which you find functions in the sub-module scipy.interpolate. The example remains the same and the code is as compact as before, while the result (see figure 5) seems almost perfect.
#
# Cubic Spline Interpolation
# i_SPLINE.py
#
from numpy import *
from scipy.interpolate import *
from matplotlib.pyplot import *

# Interpolation
x = linspace(0.0, pi / 2, 20)
y = cos(x)
gp = splrep(x, y, k=3)
gy = splev(x, gp, der=0)

# Graphical Analysis
plot(x, y, 'b')
plot(x, gy, 'ro')


Figure 4: Approximation of cosine function (line) with constant regression (red crosses), linear regression (blue dots) and quadratic regression (green triangles)

Figure 5: Approximation of cosine function (line) with cubic splines interpolation (red dots)

Roughly speaking, cubic splines interpolation is (intelligent) regression between every two observation points with a polynomial of order 3. This is of course much more flexible than a single regression with a polynomial of order 2. Two drawbacks in algorithmic terms are, however, that the observations have to be ordered in the x-dimension and that cubic splines are of limited or no use for higher dimensional problems, where OLS regression is applicable as easily as in the two-dimensional world.
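Since the spline passes exactly through the observation points, the real test is to evaluate it between them. Here is a minimal sketch that reuses the spline representation from i_SPLINE.py and compares it against the true cosine on a finer grid; the variable names xf and gyf are not part of the original module:

#
# Evaluating the Spline Between the Observation Points (sketch)
#
from numpy import linspace, cos, pi
from scipy.interpolate import splrep, splev

x = linspace(0.0, pi / 2, 20)
y = cos(x)
gp = splrep(x, y, k=3)             # spline representation as before
xf = linspace(0.0, pi / 2, 200)    # finer grid between the observations
gyf = splev(xf, gp, der=0)         # spline values on the finer grid
print "Maximum absolute error is", abs(gyf - cos(xf)).max()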

3.2 Optimization
Strictly speaking, regression and interpolation are two special forms of optimization (some kind of minimization). However, optimization techniques are needed much more often in Derivatives Analytics. An important area is, for example, the calibration of model parameters to a given set of market-observed option prices or implied volatilities. The two major approaches are global and local optimization. While the first looks for a global minimum or maximum of a function (which does not have to exist at all), the second looks for a local minimum or maximum. As an example, we take the sine function over the interval [-π, 0] with a minimum function value of -1 at -π/2. Again, the module scipy delivers the respective functions via the sub-module optimize. The code looks as follows:
#
# Finding a Minimum
# j_OPT.py
#
from numpy import *
from scipy.optimize import *

# Finding a Minimum
def y(x):
    if x < -pi or x > 0:
        return 0.0
    return sin(x)

gmin = brute(y, ((-pi, 0, 0.01),), finish=None)
lmin = fmin(y, -0.5)

# Output
print "Global Minimum is", gmin
print "Local Minimum is", lmin

Both functions, brute (global brute force algorithm) and fmin (local convex optimization algorithm), also work in multi-dimensional settings. In general, the solution of the local optimization is strongly dependent on the initialization; here the -0.5 did quite well in reaching -π/2 as the solution. A two-dimensional sketch follows the output below.
>>> ================================ RESTART ================================
>>>
Optimization terminated successfully.
         Current function value: -1.000000
         Iterations: 18
         Function evaluations: 36
Global Minimum is -1.57159265359
Local Minimum is [-1.57080078]
>>>
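To illustrate the multi-dimensional case, the following minimal sketch minimizes h(x1, x2) = sin(x1) + sin(x2) over [-π, 0] x [-π, 0] with the same two routines; the function h and the variable names are hypothetical and not part of the original module:

#
# Finding a Minimum in Two Dimensions (sketch)
#
from numpy import sin, pi
from scipy.optimize import brute, fmin

def h(p):
    x1, x2 = p
    if x1 < -pi or x1 > 0 or x2 < -pi or x2 > 0:
        return 0.0
    return sin(x1) + sin(x2)

gmin2 = brute(h, ((-pi, 0, 0.05), (-pi, 0, 0.05)), finish=None)
lmin2 = fmin(h, (-0.5, -0.5))

print "Global Minimum is", gmin2
print "Local Minimum is", lmin2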


3.3 Numerical Integration


It is not always possible to analytically integrate a given function. Then numerical integration often comes into play. We want to check numerical integration where we can do it analytically as well:

$$\int_0^1 e^x \, dx$$

The value of the integral is e^1 - e^0 ≈ 1.7182818284590451. For numerical integration, scipy again helps out with the sub-module integrate, which contains the function quad, implementing a numerical quadrature scheme (Brandimarte (2006, ch. 4) introduces numerical integration):
#
# Numerically Integrate a Function
# j_INT.py
#
from numpy import *
from scipy.integrate import *

# Numerical Integration
def f(x):
    return exp(x)

Int = quad(lambda u: f(u), 0, 1)[0]

# Output
print "Value of the Integral is", Int

The output of the numerical integration equals the analytical value (with rounding):
>>> ================================ RESTART ================================
>>>
Value of the Integral is 1.71828182846
>>>
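A short remark on the [0] indexing in the module above: quad returns a pair consisting of the integral value and an estimate of the absolute error, which you can inspect instead of discarding, as in this small sketch:

#
# Inspecting the Error Estimate Returned by quad (sketch)
#
from numpy import exp
from scipy.integrate import quad

value, abserr = quad(lambda u: exp(u), 0, 1)
print "Value of the Integral is", value
print "Estimated absolute error is", abserr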

4 Advanced Python Topics


4.1 Classes and Objects
So far, we got by mainly with modules and functions. The dominating coding paradigm of our time is, however, object-oriented programming. For example, the popularity of C++ for Derivatives Analytics stems to a great extent from the fact that it brings along powerful object-orientation. On a rather basic level, almost anything is an object in Python. What we want to do now is to implement new classes of objects, i.e. we go one level higher. For example, we can define a new class for European call options. A class is characterized by its attributes, which are stored in a function with the name __init__, and by so-called methods, like the valuation function of Black and Scholes (1973) as already implemented before. Here is sample code for two classes:

#
# Two Option Classes
# k_CLASS.py
#
from numpy import *
from scipy import stats

# Class Definitions
class Option:
    def __init__(self, s0, K, T, r, vola):
        self.s0 = s0
        self.K = K
        self.T = T
        self.r = r
        self.vola = vola
    def Value(self):
        s0, K, T, r, vola = self.s0, self.K, self.T, self.r, self.vola
        d1 = (log(s0 / K) + (r + 0.5 * vola ** 2) * T) / (vola * sqrt(T))
        d2 = d1 - vola * sqrt(T)
        val = (s0 * stats.norm.cdf(d1, 0.0, 1.0)
               - K * exp(-r * T) * stats.norm.cdf(d2, 0.0, 1.0))
        return val

class Option_Vega(Option):
    def Vega(self):
        s0, K, T, r, vola = self.s0, self.K, self.T, self.r, self.vola
        d1 = (log(s0 / K) + (r + (0.5 * vola ** 2)) * T) / (vola * sqrt(T))
        return s0 * stats.norm.cdf(d1, 0.0, 1.0) * sqrt(T)

The working becomes clear after executing the module and defining option objects by parameterizing the different classes:
>>> ================================ RESTART ================================
>>>
>>> o1 = Option(105.,100.,1.0,0.05,0.25)
>>> o1.Value()
15.654719726823579
>>> o1.Vega()
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    o1.Vega()
AttributeError: Option instance has no attribute 'Vega'
>>> o2 = Option_Vega(105.,100.,1.0,0.05,0.25)
>>> o2.Value()
15.654719726823579
>>> o2.Vega()
73.345040765170197
>>>

The class Option only contains a method called Value. The value of the option object o1 can be retrieved by invoking the method as in o1.Value(). However, the class Option has no method to calculate the Vega of the option (the Vega of an option is the first derivative of the option value with respect to the volatility, i.e. ∂C_0/∂σ). This, however, is what is included in the class Option_Vega. This class has been defined on the basis of the Option class via class Option_Vega(Option) and inherits the attributes and methods of the other class. That is why we parameterize an object of this class in the same way and why we can calculate its value in the same way.
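Inheritance can be taken one step further in the same fashion. The following is a minimal sketch, not part of the original module, that assumes the classes from k_CLASS.py are available and derives yet another class from Option_Vega, adding a Delta method based on the standard Black-Scholes result Delta = N(d1):

#
# Extending the Class Hierarchy by a Delta Method (sketch)
#
from math import log, sqrt
from scipy import stats

class Option_Delta(Option_Vega):      # Option_Vega as defined in k_CLASS.py
    def Delta(self):
        s0, K, T, r, vola = self.s0, self.K, self.T, self.r, self.vola
        d1 = (log(s0 / K) + (r + 0.5 * vola ** 2) * T) / (vola * sqrt(T))
        return stats.norm.cdf(d1, 0.0, 1.0)   # call Delta = N(d1)

o3 = Option_Delta(105., 100., 1.0, 0.05, 0.25)
print "Value of European call option is", o3.Value()
print "Delta of European call option is", o3.Delta()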

4.2 Data Import and Export


Saving and loading Python modules is really simple. However, the need to save and load Python objects also arises frequently. In this section, I want to illustrate two simple ways of storing data permanently. The first is to save and load Python objects in a pure Python environment. The second stores data in and retrieves data from Microsoft Excel spreadsheet files. This is a very important functionality since Excel is still one of the dominating front-office tools in investment banks, hedge funds, etc. Suppose we want to save our two option objects o1 and o2 to a file on disk. A respective session in IDLE could look like the following:
>>> ================================ RESTART ================================
>>>
>>> o1 = Option(105.,100.,1.0,0.05,0.25)
>>> o2 = Option_Vega(105.,100.,1.0,0.05,0.25)
>>> from cPickle import *
>>> options = open("Option_Container","w")
>>> dump(o1,options)
>>> dump(o2,options)
>>> options.close()
>>> options
<closed file 'Option_Container', mode 'w' at 0x02DE2AC0>
>>> ================================ RESTART ================================
>>>
>>> options = open("Option_Container","r")
>>> o1=load(options)
>>> o2=load(options)
>>> o1.Value()
15.654719726823579
>>> o1.Vega()
Traceback (most recent call last):
  File "<pyshell#50>", line 1, in <module>
    o1.Vega()
AttributeError: Option instance has no attribute 'Vega'
>>> o2.Vega()
73.345040765170197
>>> options.close()
>>>

Notice that the objects are loaded in the sequence in which they were stored. And you can never know (if you didn't save the information as well) how many objects there are in the file. So it could be a good idea to store the two option objects not separately but as a list.

>>> options = open("Option_Container_2","w")
>>> dump((o1,o2),options)
>>> options.close()
>>> ================================ RESTART ================================
>>>
>>> from cPickle import *
>>> options = open("Option_Container_2","r")
>>> o = load(options)
>>> o
(<__main__.Option instance at 0x03875080>,
 <__main__.Option_Vega instance at 0x039F5F58>)
>>> len(o)
2
>>> o[1]
<__main__.Option_Vega instance at 0x039F5F58>
>>> o[0]
<__main__.Option instance at 0x03875080>
>>> o[0].Value()
15.654719726823579
>>> o[1].Vega()
73.345040765170197
>>>

This seems to make life much easier. The next and last topic is to read and write data from and to Excel spreadsheets. To this end, a sample Excel workbook is needed. I produced one with DAX index quotes from calendar week 34 (source: finance.yahoo.com). The file's name is DAX.xls and it contains data as displayed in figure 6.

Figure 6: Excel sample sheet with DAX quotes from calendar week 34; source: finance.yahoo.com

To access and print the data contained in the Excel file, a module like this does the job:
#
# Reading Data from Excel Workbooks
# m_Excel_Read.py
#
from xlrd import open_workbook

# Open Workbook, Read and Print
xls = open_workbook('DAX.xls')
for s in xls.sheets():
    print 'Sheet:', s.name
    print 'Worksheet has', s.nrows, 'rows with data'
    print 'Worksheet has', s.ncols, 'columns with data'
    for row in range(s.nrows):
        data = []
        for col in range(s.ncols):
            data.append(str(s.cell(row, col).value))
        print ",".join(data)
    print

This module, once started, produces the following output:


>>> ================================ RESTART ================================
>>>
Sheet: DAX Quotes
Worksheet has 6 rows with data
Worksheet has 7 columns with data
Date,Open,High,Low,Close,Volume,Adj Close
Aug 27, 2010,5,899.99,5,957.32,5,845.49,5,951.17,26,027,900,5,951.17
Aug 26, 2010,5,937.27,5,948.91,5,896.52,5,912.58,23,078,700,5,912.58
Aug 25, 2010,5,925.48,5,954.96,5,837.90,5,899.50,29,686,600,5,899.50
Aug 24, 2010,5,962.87,5,975.83,5,869.30,5,935.44,26,305,300,5,935.44
Aug 23, 2010,6,017.19,6,055.23,5,995.37,6,010.91,19,657,100,6,010.91
>>>

A friend of mine is a seasoned investor and has some exposure to the DAX index. To please him, I will now manipulate the Close and Adj Close values such that they reflect a strong gain to 6,500.00 on Friday's closing.
#
# Writing Data in Excel Workbooks
# m_Excel_Write.py
#
from xlutils.copy import copy
from xlrd import open_workbook

# Open Workbook, Copy It, Change Values, Overwrite It
xls = open_workbook('DAX.xls', formatting_info=True)
wb = copy(xls)                 # Writable Copy Only
wsh = wb.get_sheet(0)          # Get First Sheet
wsh.write(1, 4, '6,500.00')    # Manipulate Data
wsh.write(1, 6, '6,500.00')    # Ditto
wb.save('DAX.xls')             # Save Under Same Filename (Overwrite)

The workaround with a copy of the original workbook is necessary here since xlrd only reads workbooks. You could also save the changed copy under a different filename to preserve the old Excel file, as in the short sketch below.
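A minimal variation of m_Excel_Write.py that keeps the original file untouched; the filename DAX_manipulated.xls is just an example and not from the original text:

#
# Saving the Manipulated Copy Under a New Filename (sketch)
#
from xlutils.copy import copy
from xlrd import open_workbook

xls = open_workbook('DAX.xls', formatting_info=True)
wb = copy(xls)                     # writable copy of the workbook
wsh = wb.get_sheet(0)
wsh.write(1, 4, '6,500.00')        # manipulate Close
wsh.write(1, 6, '6,500.00')        # manipulate Adj Close
wb.save('DAX_manipulated.xls')     # original DAX.xls stays unchanged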


References
[1] Black, Fischer and Myron Scholes (1973): The Pricing of Options and Corporate Liabilities. Journal of Political Economy, Vol. 81, No. 3, 637-654.
[2] Brandimarte, Paolo (2006): Numerical Methods in Finance and Economics. 2nd ed., John Wiley & Sons, Hoboken, New Jersey.
[3] Černý, Aleš (2004): Introduction to Fast Fourier Transform in Finance. Journal of Derivatives, Vol. 12, No. 1, 73-88.
[4] Cox, John, Stephen Ross and Mark Rubinstein (1979): Option Pricing: A Simplified Approach. Journal of Financial Economics, Vol. 7, No. 3, 229-263.
[5] Glasserman, Paul (2004): Monte Carlo Methods in Financial Engineering. Springer Verlag, New York et al.
[6] Langtangen, Hans Petter (2009): A Primer on Scientific Programming with Python. Springer Verlag, Berlin.

