Umesh P, Department of Computational Biology and Bioinformatics

Python Welcome Document
Python is a general-purpose, interpreted, interactive, object-oriented, high-level
programming language. Python was created by Guido van Rossum in the late eighties and early
nineties. Python is free to use, even for commercial products, because of its OSI-approved open
source license. Python source code is now available under the Python Software Foundation
License, which is compatible with the GNU General Public License (GPL).

1. How to install Python?
Python is available at the URL http://www.python.org. A list of all documentation,
installation instructions and tutorials is also available there.
Download the latest stable version of Python and install it on your computer.
For beginners, Windows installation and usage are described. Here the latest version, v2.7, is discussed.
Interactive Mode Programming
First we can use the interpreter mode in Python, which is easier to use and start with.
In the program files you can find IDLE (Python GUI). IDLE stands for Integrated DeveLopment
Environment. Click on it. This is an interactive mode of programming in Python.


Type the following text to the right of the Python prompt and press the Enter key:

>>> print "Hello welcome to python programming"

Here print evaluates each expression in turn and writes the resulting value to standard output (the
screen). If you would like to print a blank line, use print "\n"

Numbers in Python
Numbers in Python are of four types: integers, long integers, floating point and complex
numbers.
• An example of an integer is 2, which is just a whole number.
• Long integers are just bigger whole numbers.
• Examples of floating point numbers (or floats for short) are 3.23 and 52.3E-4. The E notation
indicates powers of 10. In this case, 52.3E-4 means 52.3 * 10^-4.
• Examples of complex numbers are (-3+4i) and (9.6 - 7.4i). In Python code the imaginary unit is written j, as in -3+4j.
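
The notations above can be checked with a short sketch. The variable names are illustrative and not part of the original handout, and print is written with parentheses so that the snippet should run on Python 2.7 as well as Python 3.

```python
# Illustrative values only; these names are assumptions for the sketch.
small = 52.3E-4          # E notation: 52.3 * 10**-4
z = complex(-3, 4)       # the complex number -3+4i, written -3+4j in Python

print(small)             # the float written out in full
print(z.real)            # real part
print(z.imag)            # imaginary part
print(abs(z))            # magnitude: sqrt((-3)**2 + 4**2)
```

The real and imaginary parts of a complex number are always stored as floats, even when integers are passed to complex().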
Try the following commands and print each variable





Assignment operator in Python
Variables are nothing but reserved memory locations to store values. This means that when you
create a variable you reserve some space in memory.
Based on the data type of a variable, the interpreter allocates memory and decides what can be
stored in the reserved memory. Therefore, by assigning different data types to variables, you
can store integers, decimals, or characters in these variables.
>>> x = 5
>>> k = 52.3E-4
>>> n = complex(3, 4)
>>> n.real
>>> n.imag
Do you know?
In Python, input and output are distinguished by the presence or absence of prompts (>>>). Lines that
do not begin with a prompt are output from the interpreter. Note that a secondary prompt on a line
by itself in an example means you must type a blank line; this is used to end a multi-line command.

Here we have used the simple assignment operator ("="), which assigns the value of the right-side
operand to the left-side operand. For example, c = a + b will assign the value of a + b to c.
Numeric Types in Python
Operation         Result
x + y             sum of x and y
x - y             difference of x and y
x * y             product of x and y
x / y             quotient of x and y
x % y             remainder of x / y
abs(x)            absolute value or magnitude of x
int(x)            x converted to integer
long(x)           x converted to long integer
float(x)          x converted to floating point
complex(re,im)    a complex number with real part re, imaginary part im
pow(x, y)         x to the power y (same as x ** y)
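
As a rough illustration of the table, the sketch below applies most of the operations to two sample integers. The values 7 and 2 and all variable names are assumptions chosen for the example. long() is left out because it exists only in Python 2, and the division uses float() so that the quotient is the same on Python 2.7 (where 7 / 2 gives 3) and Python 3.

```python
x, y = 7, 2                      # sample operands, chosen for illustration

total = x + y                    # sum
difference = x - y               # difference
product = x * y                  # product
quotient = x / float(y)          # true division on both Python 2 and Python 3
remainder = x % y                # remainder of x / y
magnitude = abs(-x)              # absolute value
as_int = int(3.9)                # conversion to integer drops the fraction
as_float = float(x)              # conversion to floating point
as_complex = complex(x, y)       # real part x, imaginary part y
power = pow(x, y)                # same as x ** y

print("quotient = %s, remainder = %s, power = %s" % (quotient, remainder, power))
```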





Strings in Python
Strings can be expressed in Python in several ways. They can be enclosed in single quotes or
double quotes.
One thing to note is that in a string, a single backslash at the end of the line indicates that the
string is continued in the next line, but no newline is added. For example

>>> "This is the first sentence. \
... This is the second sentence."
'This is the first sentence. This is the second sentence.'

Tip: You can use ALT+P to recall the previous command in IDLE

Comments are very important in any programming language, since they clarify the code written.
In Python, comments start with the hash character, #.

>>> # this is a comment


Try some string operations using Python





Concatenation and repetition in Python
The plus (+) sign is the string concatenation operator, and the asterisk (*) is the repetition operator.
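
A minimal sketch of the two operators, applied to strings and to lists. The sample values and variable names are illustrative assumptions.

```python
greeting = 'Enjoy' + 'Python'                  # concatenation joins two strings
echo = 'ab' * 3                                # repetition repeats a string
merged = [123, 'university'] + ['kerala']      # + also joins two lists
doubled = [123, 'university'] * 2              # * also repeats a list

print(greeting)
print(echo)
print(merged)
print(doubled)
```

Note that + does not insert a space between the strings; add one explicitly if it is wanted.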






Python Lists:
A list contains items separated by commas and enclosed within square brackets. The items
belonging to a list can be of different data types.






>>> str = 'programming'
>>> print str          # Prints complete string
>>> print str[0]       # Prints first character of the string
>>> print str[2:5]     # Prints characters from 3rd to 5th
>>> print str[2:]      # Prints string starting from 3rd character
>>> str1 = 'Enjoy'
>>> str2 = 'Python'
>>> print str1 + str2  # Combines two strings str1 and str2
>>> print str1 * 10    # Prints string ten times

>>> list = ['kerala', 257, 2.23, 0, 70.2]
>>> smalllist = [123, 'university']
>>> print list              # Prints complete list
>>> print list[0]           # Prints first element of the list
>>> print list[1:3]         # Prints elements from 2nd to 3rd
>>> print list[2:]          # Prints elements starting from 3rd element
>>> print smalllist * 2     # Prints list two times
>>> print list + smalllist  # Prints concatenated lists

Python Tuples:
A tuple is another sequence data type that is similar to the list. A tuple consists of a number of
values separated by commas. Unlike lists, tuples are enclosed within parentheses.
The main difference between lists and tuples is that lists are enclosed in square brackets and
their elements and size can be changed, whereas the elements and size of a tuple cannot be changed.
Tuples can be considered as read-only lists.
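
The read-only behaviour of tuples can be seen in a short sketch. The variable names are illustrative assumptions; the point is that assigning to an element of a tuple raises a TypeError, while the same assignment on a list succeeds.

```python
place_list = ['kerala', 257, 2.23]
place_tuple = ('kerala', 257, 2.23)

place_list[2] = 0              # allowed: lists can be changed in place

try:
    place_tuple[2] = 0         # not allowed: tuples are read-only
    tuple_changed = True
except TypeError:
    tuple_changed = False      # Python refused the assignment

print(place_list)
print(tuple_changed)
```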



Try the following





Python Dictionary:
Python's dictionaries work like associative arrays or hashes found in Perl and consist of
key-value pairs. Keys can be almost any Python type, but are usually numbers or strings. Values, on
the other hand, can be any arbitrary Python object. Dictionaries are enclosed by curly braces.
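
A small sketch of common dictionary operations, assuming a made-up dictionary called logins. The name and its contents are hypothetical and only serve the example.

```python
logins = {'james bond': '007', 'year': 2011}   # hypothetical key-value pairs

logins['binary'] = 'zero and one'      # add a new key-value pair
logins['year'] = 2012                  # replace the value stored under a key
has_year = 'year' in logins            # test whether a key is present
missing = logins.get('unknown', 'not found')   # default instead of an error
keys_sorted = sorted(logins)           # the keys, in alphabetical order

print(keys_sorted)
print(missing)
```

Looking up a missing key with logins['unknown'] would raise a KeyError, which is why get() with a default is used here.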



Try the following dictionary




>>> tuple = ('kerala', 257, 2.23, 0, 70.2)
>>> list = ['kerala', 257, 2.23, 0, 70.2]
>>> list[2] = 0    # changes the third value in list
>>> tuple[2] = 0   # raises TypeError: a tuple cannot be changed

Key------------- value
User Name------- Password
>>> dict = {'james bond': '007', 'binary': 'zero and one', 'year': 2011}
>>> dict['james bond']
>>> dict['binary']
>>> dict['year']


Operators in Python
Python uses the familiar operators to describe mathematical operations. Addition, subtraction,
multiplication and division are denoted by +, -, * and / respectively.
Comparison Operators:
Operator   Description
==         Checks whether the values of two operands are equal; if yes, the condition becomes true
!=         Checks whether the values of two operands are unequal; if yes, the condition becomes true
<>         Same as != (available in Python 2 only)
>          Checks if the value of left operand is greater than the value of right operand
<          Checks if the value of left operand is less than the value of right operand
>=         Checks if the value of left operand is greater than or equal to the value of right operand
<=         Checks if the value of left operand is less than or equal to the value of right operand

Membership Operators:
One of the key features of the Python programming language is that it has membership operators,
which test for membership in a sequence, such as strings, lists, or tuples. There are two
membership operators, explained below:
Operator   Description
in         Evaluates to true if it finds a variable in the specified sequence, and false
           otherwise.
not in     Evaluates to true if it does not find a variable in the specified sequence,
           and false otherwise.
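
The two membership operators can be tried on a string, a tuple and a list, as in this sketch. The sample data and variable names are assumptions made for the example.

```python
sentence = 'Hello hai and bye'
vowels = ('a', 'e', 'i', 'o', 'u')

found = 'hai' in sentence            # substring test on a string
absent = 'xyz' not in sentence       # true because 'xyz' never occurs
is_vowel = 'u' in vowels             # element test on a tuple
in_list = 5 in [1, 2, 3]             # element test on a list

print(found)
print(absent)
print(is_vowel)
print(in_list)
```

On strings, in looks for a substring; on lists and tuples it looks for a whole element.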
Try the following example


>>> string = 'Hello hai and bye'
>>> 'hai' in string
>>> string.count('hai')

Identity Operators:
Identity operators compare the memory locations of two objects. There are two identity
operators, explained below:
Operator   Description
is         Evaluates to true if the variables on either side of the operator point to the
           same object, and false otherwise.
is not     Evaluates to false if the variables on either side of the operator point to the
           same object, and true otherwise.
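
The difference between identity (is) and equality (==) shows up most clearly with lists. The sketch below uses illustrative names: two lists with the same contents are equal but are not the same object, whereas a second name bound to the same list is.

```python
first = [1, 2, 3]
second = [1, 2, 3]       # equal contents, but a separate object in memory
alias = first            # a second name for the very same object

same_value = (first == second)       # compares contents
same_object = (first is second)      # compares memory locations
alias_is_first = (alias is first)

alias.append(4)          # changing the object through one name...
print(first)             # ...is visible through the other name
print(same_value)
print(same_object)
```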












>>> a = 5
>>> b = 6
>>> c = 5
>>> a is b
>>> a is c

FACTS and TIPS
The input function is an interactive built-in function that communicates with the user. Try the following command
>>> int = input('Enter an integer : ')
Using this command, the user can assign a value to the variable int

How to write Python programs
Hope that you are now familiar with the syntax of Python. Now let's start programming in
Python.
The procedure to save and run a Python program is as follows:
1. Open your favorite editor (Notepad, WordPad or Notepad++)
2. Write the program code
3. Save it as a file with the filename mentioned. I follow the convention of having all Python
programs saved with the extension .py
4. Run the interpreter with the command python program.py, or use IDLE to run the programs
(here, as the easiest way to do programming, let's do it in IDLE)


To run a Python program you can use Run Module or hit F5. (Or you can run the program using the command
prompt)








Control statements
Control statements let a program make decisions and do different things depending on different
situations. There are three control flow statements in Python: if, for and while.
The if statement:
FACTS and TIPS
Whitespace at the beginning of the line is called indentation. Leading whitespace (spaces and tabs) at the
beginning of the logical line is used to determine the indentation level of the logical line, which in turn is
used to determine the grouping of statements. This means that statements which go together must have
the same indentation. Each such set of statements is called a block.
You can also use indents to give the program a good look. (Use either tabs or spaces consistently and never
mix them; the official style guide, PEP 8, recommends four spaces per indentation level.)

The if statement contains a logical expression using which data can be compared, and a decision is
made based on the result of the comparison.




Here, in the if statement, the condition is evaluated first. If the condition is true, the statement(s) block is executed.
Otherwise, the next statement following the statement(s) block is executed.






The else Statement:
An else statement can be combined with an if statement. An else statement contains the block of code
that executes if the conditional expression in the if statement resolves to 0 or a false value.
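
What counts as "0 or a false value" can be explored with a small sketch. The function name branch_taken is hypothetical; it simply reports which block ran for a given value.

```python
def branch_taken(value):
    """Report which block of an if/else runs for the given value."""
    if value:
        return 'if block'
    else:
        return 'else block'

print(branch_taken(1))        # a non-zero number counts as true
print(branch_taken(0))        # zero counts as false
print(branch_taken(''))       # an empty string counts as false
print(branch_taken('text'))   # a non-empty string counts as true
print(branch_taken([]))       # an empty list counts as false
```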










The syntax of the if statement is:
if expression:
    statement(s)

int = input('Enter 1 or 2 : ')
if int == 1:
    print "Value of integer, int is 1"
if int == 2:
    print "Value of integer, int is 2"

The syntax of the if...else statement is:
if expression:
    statement(s)
else:
    statement(s)

int = input('Please enter 1 : ')
if int == 1:
    print "Value of integer, int is 1"
else:
    print "I told you to enter 1. You have entered another value"


The while Loop:
The while loop is one of the looping constructs available in Python. The while loop continues
until the expression becomes false. The expression has to be a logical expression and must
return either a true or a false value.
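
A minimal sketch of a while loop that adds up the numbers 1 to 5. The variable names are illustrative. The loop stops as soon as the expression number <= 5 becomes false.

```python
total = 0
number = 1
while number <= 5:            # checked before every pass through the loop
    total = total + number    # add the current number to the running total
    number = number + 1       # without this line the loop would never end

print(total)
print(number)
```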




Try the following example





The for Loop:
The for loop in Python has the ability to iterate over the items of any sequence, such as a list or
a string.
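
As a sketch of iterating directly over the items of a sequence, the loops below walk over a list and then over a string without using index numbers. The sample data and names are illustrative.

```python
fruits = ['banana', 'apple', 'mango']

lengths = []
for fruit in fruits:              # fruit takes each list item in turn
    lengths.append(len(fruit))

letters = []
for letter in 'DNA':              # a string yields one character at a time
    letters.append(letter)

print(lengths)
print(letters)
```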









Syntax of while
while expression:
    statement(s)

count = 0
while (count < 9):
    print 'line number:', count
    count = count + 1
print "Finished!"

Syntax of for loop
for iterating_var in sequence:
    statement(s)

fruits = ['banana', 'apple', 'mango']
for index in range(len(fruits)):
    print 'Current fruit :', fruits[index]
print "No more fruits!"

The break Statement:
The break statement in Python terminates the current loop and resumes execution at the next
statement, just like the traditional break found in C.
The most common use for break is when some external condition is triggered requiring a hasty
exit from a loop. The break statement can be used in both while and for loops.
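
One typical use of break is to stop searching as soon as something has been found. The sketch below is an illustration under assumed names: first_unknown is a hypothetical helper that scans a DNA string for the first letter N.

```python
def first_unknown(sequence):
    """Return the index of the first 'N' in sequence, or -1 if there is none."""
    position = -1
    for index in range(len(sequence)):
        if sequence[index] == 'N':
            position = index
            break                 # leave the loop; later letters are not examined
    return position

print(first_unknown('ATGCNNAT'))
print(first_unknown('ATGC'))
```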









Functions in Python
A function is a block of organized, reusable code that is used to perform a single, related action.
We know some built-in functions in Python like print() etc. You can also create your own
functions. These functions are called user-defined functions.
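
A user-defined function might look like the following sketch. The function gc_content is a hypothetical example, not part of the original handout: it returns the fraction of G and C letters in a DNA string, and float() keeps the division exact on Python 2.7 as well as Python 3.

```python
def gc_content(sequence):
    """Return the fraction of G and C letters in sequence (0.0 if empty)."""
    if len(sequence) == 0:
        return 0.0                        # avoid dividing by zero
    gc = sequence.count('G') + sequence.count('C')
    return gc / float(len(sequence))      # float() forces true division

print(gc_content('ATGC'))
print(gc_content('GGCC'))
```

Once defined, the function can be called as often as needed with different arguments, which is the main benefit of packaging code this way.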





Try the example,


for letter in 'Bioinformatics':
    if letter == 'i':
        break
    print 'Current Letter :', letter

var = 10
while var > 0:
    print 'Current variable value :', var
    var = var - 1
    if var == 5:
        break
print "End!"

SYNTAX:
def functionname(parameters):
    "function_docstring"
    function_suite
    return [expression]

def square(int):
    k = int * int
    print 'square of entered value is'
    return k

Numpy, Scipy, and Pylab
We will familiarize ourselves with three packages of Python: Numpy, Scipy, and Pylab, which are the
packages that help in scientific computation. The NumPy and SciPy packages were born to solve the
performance-related issues in Python, to handle multi-dimensional arrays and matrices, and to
reduce the computational time and labor of programming.
First of all, we need to install these packages. Installing the packages in Windows is very easy;
it can be done with the help of the EasyInstall package. Download EasyInstall from
Python.org and install it on your PC. Go to the folder of EasyInstall (usually this will be the
Scripts folder inside the Python folder) and type the name of the chosen package.
[Path of easy installer.exe] [Space] [Required package]
C:\Python27\Scripts\easy_install.exe numpy
For Linux (Ubuntu/Debian) users, you can directly install any package by typing the following
command on the terminal:
sudo apt-get install python-[package]
sudo apt-get install python-numpy
Numpy can be imported in the Python interface by typing the command from numpy import *.
Once you import the package into the Python environment, it is ready to use. Ready, Steady,
Go!!!
>>> from numpy import *
>>> import numpy as np
>>> A = np.array([[78, 41, 53], [65, 86, 49], [94, 49, 56]])

Like most modern programming languages, indexing starts from 0. The element 78 is referred
to as A[0,0]. Type the following,
>>> A[0]     # first row
>>> A[-1]    # last row
>>> A[0,0]   # first row, first column
>>> A[0,1]   # first row, second column
>>> A[:2]    # first two rows
>>> A[:,1]   # second column

To create an array with elements in the range 0-99, the following code can be used.

>>> import numpy as np
>>> x = np.array(range(100))

Here is another Python magic: if you would like to convert the array created just now into four
5x5 matrices, you can simply use the following command,

>>> x.reshape((4,5,5))

Also try x.reshape((5,4,5)) and have a look at the result. Note that reshape returns a new array and leaves x itself unchanged.

To define matrices consisting only of zeroes or ones, or an identity matrix, try the following,

>>> np.zeros((2,4))
>>> np.ones((2,4))
>>> np.identity(3)

If you wish to create a matrix with the same dimensions as another matrix, but containing
only zeroes or only ones, it is very easy in Python! Try the following,
>>> np.ones_like(A)
>>> np.zeros_like(A)

To split an array with elements 0 to 19 into three, try the following command,

>>> x = np.array(range(20))
>>> split = np.array_split(x, 3)

Scipy
You can load the Scipy module into Python and activate all SciPy functions by
>>> import scipy
>>> from scipy import *

Now your Python is equipped with sub-packages for signal processing, Fourier transform,
statistical analysis, and packages for calculus etc. For a complete list of sub-packages in SciPy, run
the command help('scipy') in IDLE

You can import only the Scipy sub-packages that are needed for the program instead of all the packages in
Scipy.
For example,
>>> from scipy import linalg
>>> A = mat('23,24;25,26')   # to create a matrix
>>> mat(A).I                 # Inverse of matrix A

A polynomial can be represented by
>>> eq = poly1d([1, -5, 6])
Now let's see some operations on the polynomial

>>> print eq     # to print polynomial
>>> roots(eq)    # roots of polynomial

Let us see how the integral and the derivative of a polynomial can be found using Scipy

>>> print eq.integ(k=4)   # integral of eq with constant of integration 4

>>> print eq.deriv()      # derivative of eq

More functions for linear algebra and matrix operations can be obtained by importing the
scipy.linalg module into Python
>>> from scipy.linalg import *
>>> A = matrix([[5,2,4],[-3,6,2],[3,-3,1]])
>>> A.T        # Transpose of matrix
>>> A.I        # Inverse of matrix A
>>> eigval, eigvect = eig(A)
>>> eigval     # Eigenvalues of matrix A
>>> eigvect    # Eigenvectors of matrix A

There are many more linear algebra functions available in this library, but explaining all of them
is out of scope for this practitioner workshop. Please go through the documentation and try them
yourself.
The statistical toolbox consists of statistical functions. The data from a Numpy array or in a
word file can be imported into Python and we can find mean, median, variance, correlation etc

Have a look at some of the examples
>>> x = arange(-10., 10., 1)   # define an array
>>> mean(x)    # mean
>>> var(x)     # variance
>>> amin(x)    # minimum value in observation
>>> amax(x)    # maximum value in observation
>>> std(x)     # standard deviation of observation

Pylab
Matplotlib is an object-oriented plotting library for Python. It has a MATLAB/Scilab-like
application programming interface (API) and provides accurate high-quality figures, which can
be used for publication purposes.
matplotlib contains the pylab interface, which is the set of functions provided by
matplotlib.pylab to plot graphs. matplotlib.pyplot is a collection of command-
style functions that helps matplotlib to work like MATLAB.
To start a plotting experiment, first we need to import matplotlib.pyplot
>>> import matplotlib.pyplot as plt
Here the library matplotlib.pyplot is imported and labeled as plt for easy future
reference to the module.
>>> import matplotlib.pyplot as plt
>>> plt.plot([1, 2, 3, 4], [4, 3, 2, 1])
>>> plt.axis([0, 5, 0, 5])
>>> plt.show()

The plot function accepts the plotting points as two arrays with the x and y coordinates respectively.
Pyplot joins the points with straight line segments. If you need only a scatter diagram of the points try the
following code.

>>> plt.plot([1, 2, 3, 4], [4, 3, 2, 1], 'ro')
You can plot the graph using different colors and styles by putting an argument after the data in the plot
function.

import numpy as np
import matplotlib.pyplot as plt
x = np.arange(1., 10., 0.1)
y = x * x
plt.plot(x, y, 'g--')
plt.show()

After plotting the graph, to view it, you need to call the show() command.

Here you will get a green dashed line graph; try r for red, y for yellow etc. We can specify shapes
with cryptic references such as s for square, ^ for triangle etc.

>>> plt.plot(x, y, 'rs')   # Red squares
>>> plt.plot(x, y, 'g^')   # Green triangles

Standard mathematical functions can also be plotted. Let us plot a sine curve.

from pylab import *
x = arange(0., 10., 0.1)   # to define x values
y = sin(x)                 # function definition
plot(x, y)                 # to plot
grid(True)                 # to show graph in grid
show()                     # to show the plot


pylab contains pyplot together with the numpy functionalities. If you import the matplotlib
library on its own, you need to import numpy as well for defining arrays.


Language Processing with Python
Python is excellent in manipulating textual data. There are many built-in functions to
manipulate text in Python. Let us see some of them. Try the following code

>>> for line in open("file.txt"):
... for word in line.split():
... if word.endswith('ing'):
... print word
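
The same nested loop can be tried without a file on disk. The sketch below assumes an in-memory list of lines in place of file.txt; the sample sentences and variable names are invented for illustration.

```python
# Stand-in for the lines of file.txt; these sentences are made up.
lines = ['Python is amazing', 'Try programming and testing']

ing_words = []
for line in lines:
    for word in line.split():          # split() breaks the line at whitespace
        if word.endswith('ing'):       # keep only words ending in "ing"
            ing_words.append(word)

print(ing_words)
```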

Now try the following code

>>> sent = 'Let me try this statement'
>>> for char in sent:
...     print char

Try the following
str1 = 'cell'
str1.replace('c', 's')   ## To replace c with s
str1[:2]                 ## To print first two letters
str1[1:]                 ## To print letters after first letter
str1.count('l')          ## To count letter l
str1.find('l')           ## To find letter l
str1.lower()             ## To convert all letters into small letters
str1.upper()             ## To convert all letters into capital letters
str1.title()             ## To convert the string into title case
str1.rjust(20)           ## To right-justify the string in a field 20 characters wide
str1.ljust(20)           ## To left-justify the string in a field 20 characters wide





The regular expression module adds much more functionality to Python. The regular
expression module can be imported in Python by using the command 'import re'
import re
pattern = 'Year'
text = 'Happy New Year'
match = re.search(pattern, text)
s = match.start()
e = match.end()
print 'Found "%s" in "%s" from %d to %d ("%s")' % (
    match.re.pattern, match.string, s, e, text[s:e])

To validate an email ID, different methods have been used. Let us use a regular expression to find
whether a particular email ID is valid or not.

def validate(email):
    if re.match("^.+\\@(\\[?)[a-zA-Z0-9\\-\\.]+\\.([a-zA-Z]{2,3}|[0-9]{1,3})(\\]?)$", email):
        return "This is a correct E-mail ID"
    return "This is not a correct E-mail ID"

To split into words, try
sent = 'Regular expression module in Python gives more functionality to Python'
re.split(r'[ \t\n]+', sent)

Let us see some of the regular expression symbols

Regular expression   Usage
^                    To match the beginning of a string
$                    To match the end of a string
*                    To match whether the pattern is repeated zero or more times
+                    To match whether the pattern is repeated one or more times
\b                   To match a word boundary
\d                   To match any numeric digit
\D                   To match any non-numeric character
\s                   To match any whitespace character (blank space, tab, etc.)
\S                   To match any non-whitespace character
\w                   To match any alphanumeric character and the underscore
(a|b|c)              To match exactly one of a, b or c
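
Several of these symbols can be tried with the standard re module, as in this sketch. The sample text and the variable names are assumptions made for the example; raw strings (r'...') are used so that the backslashes reach the regular expression engine unchanged.

```python
import re

text = 'Sample 42 has 7 reads'                    # made-up sample text

digits = re.findall(r'\d+', text)                 # runs of one or more digits
words = re.findall(r'\w+', text)                  # runs of alphanumeric characters
starts = re.match(r'^Sample', text) is not None   # anchored at the beginning
ends = re.search(r'reads$', text) is not None     # anchored at the end
boundary = re.findall(r'\bha\w*', text)           # words that begin with "ha"
star = re.findall(r'ab*', 'a ab abb')             # "a" followed by zero or more "b"
plus = re.findall(r'ab+', 'a ab abb')             # "a" followed by one or more "b"
choice = re.findall(r'(a|b|c)', 'abcd')           # exactly one of a, b or c
parts = re.split(r'\s+', text)                    # split at whitespace

print(digits)
print(boundary)
print(star)
print(plus)
```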


Natural Language processing with Python
NLTK is a Python package for building Python programs to work with language data. It provides
easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a
suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and
semantic reasoning.

To have a glimpse of NLP using Python, try the following code

>>> import nltk
>>> sent = 'NLTK is a python package for building Python programs to work with language data'
>>> tokens = nltk.word_tokenize(sent)

Now try the following code
>>> count = nltk.FreqDist(tokens)   # frequency of each token
>>> count.tabulate()

The nltk package provides support for drawing trees.
>>> tree2 = nltk.Tree('One', ['Two', 'Three'])
>>> tree2.draw()
