
Introduction to Python

Arun Kumar

IIT Ropar

November 18, 2017

Outline of the Talk

1 Introduction to Python
Scientific Stack
Quotes about Python
2 Working With Python
Python Containers
Conditionals/Iteration/Looping/Functions/Modules
3 Introduction to NumPy
Solving system of linear equations
4 Introduction to Matplotlib
Scatter Plot
Plotting a histogram
5 Introduction to Pandas
Pandas Data Structures
6 SciPy

7 More Functions...

What is Python?

1 Flexible, powerful language with a FOSS license

2 Easy and compact syntax

3 Batteries included, i.e. it comes with a large library of useful modules.

4 Free as in "free beer" (Free Beer is itself an open source beer project).

5 Its designer, Guido van Rossum, took the name from the BBC comedy
series "Monty Python's Flying Circus".

6 Website: http://www.python.org

Scientific Stack

NumPy
provides support for large, multi-dimensional arrays and matrices.

Pandas
pandas builds on NumPy and provides richer classes for the management and analysis of time
series and tabular data.

SciPy
contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT,
signal and image processing, ODE solvers etc.

matplotlib
This is the most popular plotting and visualization library for Python, providing both 2D and 3D
visualization capabilities.

rpy2
The high-level interface in rpy2 is designed to facilitate the use of R by Python programmers.

Quotes about Python

"Python is fast enough for our site and allows us to produce
maintainable features in record times, with a minimum of
developers," said Cuong Do, Software Architect,
YouTube.com.

YouTube

"Python has been an important part of Google since the
beginning, and remains so as the system grows and
evolves. Today dozens of Google engineers use Python,
and we're looking for more people with skills in this
language," said Peter Norvig, director of search quality at
Google, Inc.

Google

"... Python also shines when it comes to code maintenance.
Without a lot of documentation, it is hard to grasp what is
going on in Java and C++ programs and even with a lot of
documentation, Perl is just hard to read and maintain," says
Friedrich (Senior Project Engineer).

United Space Alliance, NASA's main shuttle support
contractor

Popularity

Table: TIOBE Index for October 2017

Oct 2016  Oct 2017  Language  Ratings  Change
1         1         Java      12.43%   -6.37%
2         2         C         8.37%    -1.46%
3         3         C++       5.01%    -0.79%
4         4         C#        3.86%    -0.51%
5         5         Python    3.80%    +0.03%
19        13        Matlab    1.159%   +0.26%

Table: PYPL Popularity (Worldwide), Oct 2017 compared to a year ago

Rank  Language    Share  Trend
1     Java        22.2%  -0.9%
2     Python      17.6%  +4.3%
3     PHP         8.8%   -1.0%
4     JavaScript  8.0%   +0.5%
5     C#          7.7%   -0.9%

Running Python

Python programs are executed by an interpreter.

The interpreter can be started by simply typing python in a command
shell.

We can also use the IDLE environment for running Python.

Once started, the interpreter runs a read-evaluate-print loop: it reads
a statement, evaluates it and prints the result.

When the interpreter starts, it shows the >>> prompt.

Variables and Arithmetic Expressions

Example (Python as a Calculator)
>>> 2+2
4

Example (Dynamically Typed)
>>> a = 1000
>>> a = 0.05

Remark
A variable is a symbolic name that refers to a value stored in the
computer's memory.
Python is a dynamically typed language: variable names are bound to
different values, possibly of varying types, during program execution.
The equals sign "=" should be read as "is set to", not as "is equal to".
For x = 2, y = 2, z = 2, id(x), id(y) and id(z) will be the same in
CPython, because small integers are cached and all three names refer to
the same object.
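
The remark above can be tried out with a minimal sketch like the following. The variable names type_before, type_after and same_object are illustrative only, and the shared identity of small integers is a CPython implementation detail rather than a guarantee of the language.

```python
# Rebinding a name: the type belongs to the value, not to the variable.
a = 1000
type_before = type(a)   # int
a = 0.05
type_after = type(a)    # float

# Three names bound to the same small integer object (CPython caches small ints).
x = 2
y = 2
z = 2
same_object = id(x) == id(y) == id(z)

print(type_before.__name__, type_after.__name__, same_object)
```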

Python Containers

Strings
To create string literals, enclose them in single, double, or triple quotes as
follows:

Example
>>> a = "Hello World"; b = 'Python is good'; c = """computer says no"""

Lists
Lists are sequences of arbitrary objects. You create a list by enclosing values
in square brackets, as follows:

Example
>>> names = ["a", "b", "c", "d"]
>>> weights = [45, 50, 70, 55]

Python Containers Contd...

Tuples
You create a tuple by enclosing a group of values in parentheses. Unlike lists,
the contents of a tuple cannot be modified after creation.

Example
>>> stock = "GOOG", 100, 490.10
or by using
>>> stock = ("GOOG", 100, 490.10)

Sets
A set is an unordered collection of unique objects. Unlike lists and
tuples, sets cannot be indexed by numbers.

Example
>>> s = set([1,1,2,3,4,5,3])
>>> s
{1, 2, 3, 4, 5}

Python Containers Contd...

Dictionaries
A dictionary is an associative array that contains objects indexed by keys. A
dictionary can be created as follows:

Example
>>> stock = {"name": "GOOG", "shares": 100, "price": 200.10}
>>> stock["date"] = "18 Nov 2017"

Remark
Essentially, containers are those Python objects which have a __contains__
method defined.
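
As a small sketch of this remark: the `in` operator works on the built-in containers, and a user-defined class can take part by defining __contains__. The class EvenNumbers below is hypothetical and exists only for illustration.

```python
# Built-in containers all support the `in` operator.
names = ["a", "b", "c", "d"]
stock = {"name": "GOOG", "shares": 100, "price": 200.10}
unique = set([1, 1, 2, 3, 4, 5, 3])

in_list = "b" in names          # membership in a list
in_dict = "shares" in stock     # for a dictionary, `in` tests the keys
in_set = 6 in unique            # 6 was never added


class EvenNumbers(object):
    """Hypothetical container: 'contains' every even integer."""

    def __contains__(self, value):
        return isinstance(value, int) and value % 2 == 0


evens = EvenNumbers()
print(in_list, in_dict, in_set, 10 in evens, 7 in evens)
```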

Conditionals

>>> temp = 25
>>> if temp > 20 and temp < 28:
...     print("pleasant")
... else:
...     print("extreme")
>>> names = ["Amitabh", "Aishwarya", "Salman", "Abhishek"]
>>> for name in names:
...     if name[0] in "AEIOU":
...         print(name + " starts with a vowel")
...     else:
...         print(name + " starts with a consonant")

Iteration and Looping

The most widely used looping construct is the for statement, which is used to
iterate over a collection of items.

Example
>>> for n in [1,2,3]:
...     print("2 to the %d power is %d" % (n, 2**n))
...
2 to the 1 power is 2
2 to the 2 power is 4
2 to the 3 power is 8

The same kind of loop can be written with the range function; range(1,6)
produces the integers 1 to 5:

Example
>>> for n in range(1,6):
...     print("2 to the %d power is %d" % (n, 2**n))

Functions and Modules

Functions
The def statement is used to create a function.

Example
>>> def remainder(a,b):
...     q = a//b; r = a - q*b
...     return r

Modules
A module is a file of classes and functions collected for reuse.
1 Save the function above in a file rem.py, in a folder such as
C:\Users\Admin\Desktop\myModule
2 Append the path of that folder to the sys.path list as follows:
>>> import sys
>>> sys.path.append(r"C:\Users\Admin\Desktop\myModule")
3 Import the module and call the function:
>>> import rem
>>> rem.remainder(10,20)
10
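
The three steps can also be run end to end as a self-contained sketch. It assumes a temporary folder created with tempfile in place of the Desktop folder, and it writes rem.py from a string so that nothing has to exist beforehand.

```python
import importlib
import os
import sys
import tempfile

# Step 1: save the module source as rem.py in some folder (a temporary one here).
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "rem.py"), "w") as handle:
    handle.write(
        "def remainder(a, b):\n"
        "    q = a // b\n"
        "    return a - q * b\n"
    )

# Step 2: append the folder to sys.path so the import system can find it.
sys.path.append(module_dir)
importlib.invalidate_caches()

# Step 3: import the module and call its function.
import rem

result = rem.remainder(10, 20)
print(result)
```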
Python Objects and Classes

Class
A class is a group or category of things that share some properties or
attributes and differ from others by kind, type, or quality.

Object
An object is an instance of a class. An object can call the methods
and access the attributes that are defined in its class.

Classes

Python Class
class Stack(object):
    def __init__(self):
        self.stack = [ ]
    def push(self, item):
        self.stack.append(item)
    def pop(self):
        return self.stack.pop()
    def length(self):
        return len(self.stack)

Remark
self represents the instance of the class. Through self we can access
the attributes and methods of that instance. self is a naming
convention, not a keyword.
"__init__" is a special method in Python classes, commonly called the
constructor in object-oriented terms. It is called when an object is
created from the class, and it allows the class to initialize the
attributes of the new object.

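
A short usage sketch may help here: it repeats the Stack class so that the block runs on its own, then creates an instance and calls its methods. The pushed values are arbitrary.

```python
class Stack(object):
    def __init__(self):
        self.stack = []          # each instance gets its own list

    def push(self, item):
        self.stack.append(item)

    def pop(self):
        return self.stack.pop()  # removes and returns the last item pushed

    def length(self):
        return len(self.stack)


s = Stack()                      # __init__ runs here
s.push("Dave")
s.push(42)
top = s.pop()
print(top, s.length())
```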
Solving system of linear equations

Suppose you want to solve the system of linear equations

x + 2y = 5
3x + 4y = 6

We can solve it with the help of the Python package NumPy as follows:

Example
>>> import numpy as np
>>> A = np.array([[1,2],[3,4]])
>>> b = np.array([[5],[6]])
>>> np.linalg.solve(A,b)
array([[-4. ],
       [ 4.5]])

Example (Finding determinant and inverse)
>>> np.linalg.det(A)
>>> np.linalg.inv(A)
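
As a quick check, the sketch below solves the same system, substitutes the solution back, and compares it with the route through the inverse. The names residual and same_solution are illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0], [6.0]])

x = np.linalg.solve(A, b)        # solves A x = b without forming the inverse
det_A = np.linalg.det(A)         # 1*4 - 2*3 = -2
A_inv = np.linalg.inv(A)

# Both routes should agree, and substituting x back should reproduce b.
residual = np.abs(A.dot(x) - b).max()
same_solution = np.allclose(A_inv.dot(b), x)
print(x.ravel(), round(det_A, 6), same_solution)
```

Using solve is usually preferred over multiplying by the inverse, since it avoids computing the inverse explicitly.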

Matplotlib

This is an object-oriented plotting library. A procedural interface is provided
by the companion pyplot module, which may be imported directly, e.g.:
import matplotlib.pyplot as plt

[Figure: 3D plots using matplotlib, four panels (a)-(d)]


Plotting a Function

Example (Plotting a function)
Suppose we want to plot $f(t) = e^{-t} \cos(2\pi t)$.

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> def f(t):
...     return np.exp(-t) * np.cos(2*np.pi*t)
>>> t1 = np.arange(0.0, 5.0, 0.1)
>>> plt.plot(t1, f(t1))
>>> plt.show()

Scatter Plot

A scatter plot helps in visualizing the association between two random
variables.

Example (Scatter plots)
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> x = np.random.normal(0,1,1000)
>>> y = np.random.normal(0,1,1000)
>>> plt.scatter(x,y)
>>> plt.show()
>>> np.corrcoef(x,y)

Example (Linearly related rvs)
>>> x = np.random.normal(0,1,100)
>>> y = 2*x + 5
>>> plt.scatter(x,y)
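
The two scatter plots correspond to two very different correlation coefficients. The sketch below computes both without plotting; the fixed seed and the names r_independent and r_linear are assumptions made for repeatability and illustration.

```python
import numpy as np

rng = np.random.RandomState(0)           # fixed seed, chosen only for repeatability
x = rng.normal(0, 1, 1000)
y_independent = rng.normal(0, 1, 1000)   # unrelated to x
y_linear = 2 * x + 5                     # exact linear function of x

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is the correlation.
r_independent = np.corrcoef(x, y_independent)[0, 1]
r_linear = np.corrcoef(x, y_linear)[0, 1]
print(round(r_independent, 3), round(r_linear, 3))
```

The independent pair should give a value near 0, while the exactly linear pair gives 1 up to rounding.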

Histogram

Example (Standard normal to general normal rv)
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> mu, sigma = 100, 15
>>> x = mu + sigma * np.random.randn(10000)
>>> n, bins, patches = plt.hist(x, 50, density=True, facecolor='g')
>>> plt.grid(True)
>>> plt.show()

Example (General normal directly)
>>> x = np.random.normal(100,15,10000)
>>> n, bins, patches = plt.hist(x, 50, density=True, facecolor='r')
>>> plt.show()
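
The same transformation can be checked numerically. This sketch uses np.histogram in place of plt.hist, so nothing is drawn, and confirms that a density-normalized histogram has total area 1. The seed and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.RandomState(1)   # fixed seed for repeatability
mu, sigma = 100, 15

z = rng.randn(10000)             # standard normal sample
x = mu + sigma * z               # shifted and scaled: N(mu, sigma^2)

# density=True rescales the bar heights so that the total bar area is 1.
heights, edges = np.histogram(x, bins=50, density=True)
area = (heights * np.diff(edges)).sum()
print(round(x.mean(), 1), round(x.std(), 1), round(area, 6))
```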

Simulating 3D Brownian Motion

>>> import pandas
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from mpl_toolkits.mplot3d import Axes3D
>>> x = np.random.normal(0,1,100)
>>> x_cumsum = x.cumsum()
>>> y = np.random.normal(0,1,100)
>>> y_cumsum = y.cumsum()
>>> z = np.random.normal(0,1,100)
>>> z_cumsum = z.cumsum()
>>> fig = plt.figure()
>>> ax = fig.add_subplot(111, projection='3d')
>>> ax.plot(x_cumsum, y_cumsum, z_cumsum)
>>> # "\\path\\" is a placeholder for the output file path; save before
>>> # plt.show(), otherwise an empty figure may be written
>>> plt.savefig("\\path\\")
>>> plt.show()
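
The construction is the same for each coordinate: draw independent normal increments and take their running sum. A minimal sketch, assuming a hypothetical helper brownian_path with a time step dt (the slide uses dt = 1):

```python
import numpy as np


def brownian_path(n_steps, dim=3, dt=1.0, seed=None):
    """Return an (n_steps + 1, dim) array: a Brownian path started at the origin."""
    rng = np.random.RandomState(seed)
    # Independent N(0, dt) increments for each coordinate.
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_steps, dim))
    path = np.zeros((n_steps + 1, dim))
    path[1:] = increments.cumsum(axis=0)   # running sum of the increments
    return path


path = brownian_path(100, seed=42)
print(path.shape)
```

The columns of path can be passed to ax.plot in the same way as x_cumsum, y_cumsum and z_cumsum.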

3D Brownian Motion

Pandas

Pandas helps to carry out your entire data analysis workflow in Python
without having to switch to a more domain-specific language like R.

What People Say About Pandas

Roni Israelov, Portfolio Manager, AQR Capital Management: "pandas
allows us to focus more on research and less on programming. We have found pandas easy
to learn, easy to use, and easy to maintain. The bottom line is that it has increased our
productivity."

David Himrod, Director of Optimization & Analytics: "pandas is the perfect
tool for bridging the gap between rapid iterations of ad-hoc analysis and production quality
code. If you want one tool to be used across a multi-disciplined organization of engineers,
mathematicians and analysts, look no further."

Olivier Pomel, CEO, Datadog: "We use pandas to process time series data on our
production servers. The simplicity and elegance of its API, and its high level of performance
for high-volume datasets, made it a perfect choice for us."

Series and DataFrame

Example (Pandas Series)
>>> import datetime
>>> import numpy as np
>>> import pandas
>>> dt1 = datetime.datetime(2015,1,1)
>>> dt2 = datetime.datetime(2015,1,10)
>>> dates = pandas.date_range(dt1,dt2,freq = 'D')
>>> value = np.random.normal(0,1,10)
>>> ts = pandas.Series(value, dates)

Example (Pandas DataFrame)
>>> v1 = np.random.normal(0,1,10)
>>> v2 = np.random.normal(0,1,10)
>>> d = {'col1': v1, 'col2': v2}
>>> df = pandas.DataFrame(d, index = dates)
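
Once the two structures exist, a few typical operations look like the sketch below. The date slice, the column means and the new column "total" are illustrative choices, not part of the original example.

```python
import numpy as np
import pandas

dates = pandas.date_range("2015-01-01", "2015-01-10", freq="D")
rng = np.random.RandomState(0)            # fixed seed for repeatability

ts = pandas.Series(rng.normal(0, 1, 10), index=dates)
df = pandas.DataFrame({"col1": rng.normal(0, 1, 10),
                       "col2": rng.normal(0, 1, 10)}, index=dates)

first_week = ts.loc["2015-01-01":"2015-01-07"]   # label-based slice, both ends included
col_means = df.mean()                            # one mean per column
df["total"] = df["col1"] + df["col2"]            # new column from existing ones
print(len(first_week), list(df.columns))
```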

Binomial Distribution

Example
A company drills 10 oil exploration wells, each with an 8% chance of success.
Eight of the ten wells fail. What is the probability of that happening?
>>> import scipy.stats
>>> x = scipy.stats.binom(n=10, p=0.08)
>>> x.pmf(2)   # eight failures means exactly two successes
0.14780703546361768

Solving by Simulation
>>> N = 20000
>>> x = scipy.stats.binom(n=10, p=0.08)
>>> rns = x.rvs(N)
>>> (rns == 2).sum() / float(N)
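
The same probability follows directly from the binomial formula $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. A minimal sketch using only the standard library, with a hypothetical helper binom_pmf:

```python
from math import factorial


def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    n_choose_k = factorial(n) // (factorial(k) * factorial(n - k))
    return n_choose_k * p ** k * (1 - p) ** (n - k)


# Eight failures out of ten wells means exactly two successes.
prob = binom_pmf(2, 10, 0.08)
total = sum(binom_pmf(k, 10, 0.08) for k in range(11))
print(round(prob, 6), round(total, 6))
```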

Cubic Spline interpolation

Suppose we want to interpolate between discrete points obtained from the
function $f(x) = e^{-x} \sin(x)$.

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from scipy import interpolate
>>> x = np.arange(0, 10, 0.5)
>>> y = np.exp(-x)*np.sin(x)
>>> f = interpolate.interp1d(x, y, kind = 'cubic')
>>> xnew = np.arange(0, 9, 0.05)
>>> ynew = f(xnew)
>>> plt.plot(x, y, 'o', xnew, ynew, '-')
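
To see what the cubic option buys, the sketch below compares linear and cubic interpolation against the exact function on the fine grid. The error measures err_linear and err_cubic are illustrative names.

```python
import numpy as np
from scipy import interpolate

x = np.arange(0, 10, 0.5)                 # coarse sample points
y = np.exp(-x) * np.sin(x)

f_linear = interpolate.interp1d(x, y, kind="linear")
f_cubic = interpolate.interp1d(x, y, kind="cubic")

xnew = np.arange(0, 9, 0.05)              # must stay inside [x[0], x[-1]]
exact = np.exp(-xnew) * np.sin(xnew)

err_linear = np.abs(f_linear(xnew) - exact).max()
err_cubic = np.abs(f_cubic(xnew) - exact).max()
print(err_linear, err_cubic)
```

Note that interp1d raises an error for points outside the sampled range unless extrapolation is requested, which is why xnew stops at 9.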

More Functions...

Expectation and PMF

Example (Expected number of trials to get first head)
>>> import numpy as np
>>> def expectedTrials(noTrials = 1000):
...     cnt = 0
...     for i in range(noTrials):
...         br = np.random.binomial(1, 0.5, 500)   # 500 fair coin tosses
...         indx = list(br).index(1)+1             # position of the first head
...         cnt += indx
...     return float(cnt)/noTrials
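
The number of tosses up to the first head follows a geometric distribution with mean $1/p$, so the simulation should return a value near 2 for a fair coin. An alternative sketch, using only the standard library and tossing until the first head instead of drawing 500 tosses in advance; the helper name and the seed are assumptions:

```python
import random


def trials_until_first_head(rng, p=0.5):
    """Toss a coin with P(head) = p until the first head; return the toss count."""
    count = 1
    while rng.random() >= p:     # tail with probability 1 - p: toss again
        count += 1
    return count


rng = random.Random(2017)        # fixed seed for repeatability
n_runs = 20000
average = sum(trials_until_first_head(rng) for _ in range(n_runs)) / float(n_runs)
print(average)                   # should be close to 1 / 0.5 = 2
```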

Example (PMF of number of heads in two coin tosses)
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> N = 500
>>> heads = np.zeros(N, dtype=int)
>>> for i in range(N):
...     # random integers, low inclusive, high exclusive: two tosses of 0 or 1
...     heads[i] = np.random.randint(low = 0, high = 2, size = 2).sum()
>>> # relative frequencies of 0, 1 and 2 heads estimate the PMF
>>> plt.stem(np.bincount(heads) / float(N), markerfmt = 'o')
>>> plt.show()
Reading Data from an Excel File

Load data into a pandas DataFrame
>>> import pandas
>>> import xlrd
>>> data = pandas.ExcelFile(r"C:\Users\Admin\Desktop\USTREASURY-REALYIELD.xls")
>>> data.sheet_names
['Worksheet1']
>>> df = data.parse('Worksheet1')

Data Download from Google, Yahoo! Finance

Example (Equity Data Download)
>>> import numpy as np
>>> import pandas
>>> import pandas_datareader as pdr
>>> data = pdr.get_data_yahoo("MSFT", start = "1/1/2015", end = "10/14/2015")

Example (FX Data Download)
>>> import pandas_datareader.data as web
>>> web.get_data_fred("DEXJPUS")

Note: you can check all the symbols on the page https://research.stlouisfed.org/fred2/categories/94

Regression Using sklearn Package

Example (Ordinary Least Squares Regression)
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> from sklearn import datasets, linear_model
>>> import pandas
>>> import xlrd
>>> data = pandas.ExcelFile(r"/home/arun/Desktop/MyFunctions/PatientData.xlsx")
>>> data.sheet_names
>>> df = data.parse('Sheet1')
>>> Y = df['ln Urea'].values.reshape(-1, 1)   # column vectors, as sklearn expects
>>> X = df['Age'].values.reshape(-1, 1)
>>> plt.scatter(X, Y, color='black')
>>> regr = linear_model.LinearRegression()
>>> regr.fit(X, Y)
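
Since PatientData.xlsx is not available here, the following sketch fits the same kind of line on synthetic data. It uses np.linalg.lstsq rather than sklearn, and the data-generating values (intercept 1.0, slope 0.02, noise 0.1) are purely hypothetical.

```python
import numpy as np

rng = np.random.RandomState(0)                 # fixed seed for repeatability
age = rng.uniform(20, 80, 200)                 # hypothetical predictor
ln_urea = 1.0 + 0.02 * age + rng.normal(0, 0.1, 200)   # hypothetical response

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(age), age])
coef, _, _, _ = np.linalg.lstsq(X, ln_urea, rcond=None)
intercept, slope = coef

# R^2: share of the variance in the response explained by the fitted line.
fitted = X.dot(coef)
r_squared = 1 - ((ln_urea - fitted) ** 2).sum() / ((ln_urea - ln_urea.mean()) ** 2).sum()
print(round(intercept, 2), round(slope, 3), round(r_squared, 2))
```

With sklearn, the corresponding quantities are available as regr.intercept_, regr.coef_ and regr.score(X, Y) after fitting.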

Option Pricing Using Binomial Model

>>> import sys
>>> sys.path.append(r"/home/arun/Desktop/MyFunctions/")
>>> import BinomialPricing
>>> u, d, s0, vol = BinomialPricing.fetchData()
>>> # Instantiate the OptionCRR class
>>> noption = BinomialPricing.OptionCRR(230, 0.25, 210, 0.5, 0.04545, 5)
>>> noption.price()

Infosys Option Price

Using n-period Binomial Model
>>> u, d, s0, vol = BinomialPricing.fetchData()
>>> noption = BinomialPricing.Option(s0, u, d, 0.0611/365, 40, 960)
>>> noption.price()

Using CRR Model
>>> import math
>>> u, d, s0, vol = BinomialPricing.fetchData()
>>> annualized_vol = math.sqrt(252)*vol
>>> # arguments: s0, sigma, strike, maturity, rfr, n
>>> noption = BinomialPricing.OptionCRR(s0, annualized_vol, 960, float(40)/365, 0.0611, 100)
>>> noption.price()
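
BinomialPricing is the speaker's own module and its source is not shown in these slides. As a rough illustration of what a CRR pricer computes, here is a sketch of the standard Cox-Ross-Rubinstein tree. It assumes a European call and the argument order s0, sigma, strike, maturity, rfr, n; the function name crr_call_price is hypothetical and is not the module's API.

```python
import math


def crr_call_price(s0, sigma, strike, maturity, rfr, n):
    """European call price on an n-step Cox-Ross-Rubinstein binomial tree."""
    dt = maturity / float(n)
    u = math.exp(sigma * math.sqrt(dt))      # up factor
    d = 1.0 / u                              # down factor
    q = (math.exp(rfr * dt) - d) / (u - d)   # risk-neutral probability of an up move
    discount = math.exp(-rfr * dt)

    # Option payoffs at maturity, indexed by the number of up moves j.
    values = [max(s0 * u ** j * d ** (n - j) - strike, 0.0) for j in range(n + 1)]

    # Roll back through the tree: discounted risk-neutral expectation at each node.
    for step in range(n, 0, -1):
        values = [discount * (q * values[j + 1] + (1 - q) * values[j])
                  for j in range(step)]
    return values[0]


price = crr_call_price(230, 0.25, 210, 0.5, 0.04545, 5)
print(price)
```

As n grows, the tree price approaches the Black-Scholes value for the same inputs.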

References

Beazley, D. M. (2009). Python: Essential Reference (4th ed.). Pearson
Education, Inc.

Downey, A., Elkner, J., Meyers, C. (2002). How to Think Like a Computer
Scientist: Learning with Python. Green Tea Press. (Free book)

Mannila, L. et al. (2006). What about a simple language? Analyzing the
difficulties in learning to program. Computer Science Education, vol.
16(3): 211-227.

http://matplotlib.org/

http://www.numpy.org/

http://pandas.pydata.org/

https://www.python.org/

http://www.scipy.org/

THANK YOU!
