
Strings

Python for Informatics: Exploring Information


www.pythonlearn.com
Unless otherwise noted, the content of this course material is licensed under a Creative
Commons Attribution 3.0 License.
http://creativecommons.org/licenses/by/3.0/.

Copyright 2010- Charles Severance


String Data Type

• A string is a sequence of characters
• A string literal uses quotes: 'Hello' or "Hello"
• For strings, + means "concatenate"
• When a string contains numbers, it is still a string
• We can convert numbers in a string into a number using int()

>>> str1 = "Hello"
>>> str2 = 'there'
>>> bob = str1 + str2
>>> print bob
Hellothere
>>> str3 = '123'
>>> str3 = str3 + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects
>>> x = int(str3) + 1
>>> print x
124
>>>
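As a small sketch of the conversion rule above: the helper name add_one is hypothetical and not part of the original slides. The code calls print() with a single argument so that it should behave the same under Python 2 and Python 3.

```python
# Hypothetical helper (not from the slides): add 1 to a number stored as a string.
def add_one(text):
    # int() turns the digits in the string into an integer we can do math with
    return int(text) + 1

str3 = '123'
print(add_one(str3))     # 124
print(str3 + str(1))     # 1231 (str() makes + concatenate instead of failing)
```

Going the other way, str() converts a number to a string, which is what + needs when the goal is concatenation rather than arithmetic.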
Reading and Converting

• We prefer to read data in using strings and then parse and convert the data as we need
• This gives us more control over error situations and/or bad user input
• Raw input numbers must be converted from strings

>>> name = raw_input('Enter:')
Enter:Chuck
>>> print name
Chuck
>>> apple = raw_input('Enter:')
Enter:100
>>> x = apple - 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'str' and 'int'
>>> x = int(apple) - 10
>>> print x
90
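One way the "more control over bad user input" point might look in practice is sketched below. The helper to_int and its default value of -1 are assumptions made for illustration, and a literal string stands in for the value that raw_input() would return.

```python
# Hypothetical helper (not from the slides): convert text to an integer,
# returning a default value when the text is not a valid number.
def to_int(text, default=-1):
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for input such as 'ten' or ''
        return default

apple = '100'               # stands in for the string read from the user
print(to_int(apple) - 10)   # 90
print(to_int('ten'))        # -1
```

Because the data arrives as a string, the program decides what to do with bad input instead of stopping with a traceback.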
Looking Inside Strings

b a n a n a
0 1 2 3 4 5

• We can get at any single character in a string using an index specified in square brackets
• The index value must be an integer and starts at zero
• The index value can be an expression that is computed

>>> fruit = 'banana'
>>> letter = fruit[1]
>>> print letter
a
>>> n = 3
>>> w = fruit[n - 1]
>>> print w
n
A Character Too Far

• You will get a Python error if you attempt to index beyond the end of a string.
• So be careful when constructing index values and slices

>>> zot = 'abc'
>>> print zot[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>>
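A minimal sketch of being careful with index values: check the index against len() before using it. The helper char_at is hypothetical and not part of the slides; it only accepts positions from zero up to the last character.

```python
# Hypothetical helper (not from the slides): return the character at a
# position, or None when the position is outside the string.
def char_at(text, index):
    if 0 <= index < len(text):
        return text[index]
    return None

zot = 'abc'
print(char_at(zot, 2))   # c
print(char_at(zot, 5))   # None
```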
Strings Have Length

b a n a n a
0 1 2 3 4 5

• There is a built-in function len that gives us the length of a string

>>> fruit = 'banana'
>>> print len(fruit)
6
Len Function

A function is some stored code that we use. A function takes some input and produces an output.

>>> fruit = 'banana'
>>> x = len(fruit)
>>> print x
6

'banana' (a string)  ->  len() function  ->  6 (a number)

Guido wrote this code. Inside, len() is just stored code (pseudocode):

def len(inp):
    blah
    blah
    for x in y:
        blah
        blah
Looping Through Strings

• Using a while statement and an iteration variable, and the len function, we can construct a loop to look at each of the letters in a string individually

fruit = 'banana'
index = 0
while index < len(fruit) :
    letter = fruit[index]
    print index, letter
    index = index + 1

Output:
0 b
1 a
2 n
3 a
4 n
5 a
Looping Through Strings

• A definite loop using a for statement is much more elegant
• The iteration variable is completely taken care of by the for loop

fruit = 'banana'
for letter in fruit :
    print letter

Output:
b
a
n
a
n
a

The equivalent while loop, for comparison:

index = 0
while index < len(fruit) :
    letter = fruit[index]
    print letter
    index = index + 1
Looping and Counting

• This is a simple loop that loops through each letter in a string and counts the number of times the loop encounters the 'a' character.

word = 'banana'
count = 0
for letter in word :
    if letter == 'a' :
        count = count + 1
print count
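The counting loop above can be wrapped in a function so it works for any word and letter. The name count_letter is hypothetical. The last line uses the built-in count() string method as a cross-check. The sketch uses single-argument print() so it runs under Python 2 and 3.

```python
# Hypothetical helper (not from the slides): count how often target occurs in word.
def count_letter(word, target):
    count = 0
    for letter in word:
        if letter == target:
            count = count + 1
    return count

print(count_letter('banana', 'a'))   # 3
print('banana'.count('a'))           # 3 (built-in string method, same answer)
```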
Looking deeper into in

• The iteration variable "iterates" through the sequence (ordered set)
• The block (body) of code is executed once for each value in the sequence
• The iteration variable moves through all of the values in the sequence

for letter in 'banana' :
    print letter

Here letter is the iteration variable and 'banana' is the six-character string.
[Flowchart: a "Done?" test that exits on "Yes"; otherwise "Advance letter" through b a n a n a, then "print letter"]

for letter in 'banana' :
    print letter

The iteration variable "iterates" through the string and the block (body) of code is executed once for each value in the sequence
Slicing Strings

M  o  n  t  y     P  y  t  h  o  n
0  1  2  3  4  5  6  7  8  9  10 11

• We can also look at any continuous section of a string using a colon operator
• The second number is one beyond the end of the slice - "up to but not including"
• If the second number is beyond the end of the string, it stops at the end

>>> s = 'Monty Python'
>>> print s[0:4]
Mont
>>> print s[6:7]
P
>>> print s[6:20]
Python
Slicing Strings

M  o  n  t  y     P  y  t  h  o  n
0  1  2  3  4  5  6  7  8  9  10 11

• If we leave off the first number or the last number of the slice, it is assumed to be the beginning or end of the string respectively

>>> s = 'Monty Python'
>>> print s[:2]
Mo
>>> print s[8:]
thon
>>> print s[:]
Monty Python
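A short sketch that exercises the slicing rules on the same string. The variable names first and second are made up for illustration. The sketch uses single-argument print() so it runs under Python 2 and 3.

```python
s = 'Monty Python'

first = s[:5]     # from the beginning up to but not including index 5
second = s[6:]    # from index 6 through the end of the string
print(first)      # Monty
print(second)     # Python

# Splitting a string at any position and rejoining the halves loses nothing.
print(s[:5] + s[5:] == s)    # True

# An end value past the end of the string simply stops at the end.
print(s[6:20] == s[6:])      # True
```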
String Concatenation

• When the + operator is applied to strings, it means "concatenation"

>>> a = 'Hello'
>>> b = a + 'There'
>>> print b
HelloThere
>>> c = a + ' ' + 'There'
>>> print c
Hello There
>>>
Using in as an Operator

• The in keyword can also be used to check to see if one string is "in" another string
• The in expression is a logical expression and returns True or False and can be used in an if statement

>>> fruit = 'banana'
>>> 'n' in fruit
True
>>> 'm' in fruit
False
>>> 'nan' in fruit
True
>>> if 'a' in fruit :
...     print 'Found it!'
...
Found it!
>>>
String Comparison
if word == 'banana':
    print 'All right, bananas.'

if word < 'banana':
    print 'Your word,' + word + ', comes before banana.'
elif word > 'banana':
    print 'Your word,' + word + ', comes after banana.'
else:
    print 'All right, bananas.'
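One detail the comparison above depends on is stated here as general Python behavior rather than something taken from the slide: strings are compared character by character using character codes, so every uppercase letter sorts before every lowercase letter. A common workaround, sketched below, is to convert both sides to the same case first.

```python
print('apple' < 'banana')                # True
print('Pineapple' < 'banana')            # True: 'P' sorts before 'b'
print('Pineapple'.lower() < 'banana')    # False: 'p' sorts after 'b'
```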
String Library

• Python has a number of string functions which are in the string library
• These functions are already built into every string - we invoke them by appending the function to the string variable
• These functions do not modify the original string; instead they return a new string that has been altered

>>> greet = 'Hello Bob'
>>> zap = greet.lower()
>>> print zap
hello bob
>>> print greet
Hello Bob
>>> print 'Hi There'.lower()
hi there
>>>
>>> stuff = 'Hello world'
>>> type(stuff)
<type 'str'>
>>> dir(stuff)
['capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs',
'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace',
'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind',
'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith',
'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

http://docs.python.org/lib/string-methods.html
String Library

str.capitalize()
str.center(width[, fillchar])
str.endswith(suffix[, start[, end]])
str.find(sub[, start[, end]])
str.lstrip([chars])
str.replace(old, new[, count])
str.lower()
str.rstrip([chars])
str.strip([chars])
str.upper()

http://docs.python.org/lib/string-methods.html
Searching a String

b a n a n a
0 1 2 3 4 5

• We use the find() function to search for a substring within another string
• find() finds the first occurrence of the substring
• If the substring is not found, find() returns -1
• Remember that string position starts at zero

>>> fruit = 'banana'
>>> pos = fruit.find('na')
>>> print pos
2
>>> aa = fruit.find('z')
>>> print aa
-1
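Since find() only reports the first occurrence, later ones can be reached with its optional start argument. The sketch below collects every position in a list. The helper find_all is hypothetical, and the use of a list goes slightly beyond what these slides cover.

```python
# Hypothetical helper (not from the slides): positions of every occurrence of sub.
def find_all(text, sub):
    positions = []
    pos = text.find(sub)
    while pos != -1:
        positions.append(pos)
        # resume the search one character past the previous match
        pos = text.find(sub, pos + 1)
    return positions

print(find_all('banana', 'na'))   # [2, 4]
print(find_all('banana', 'z'))    # []
```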
Making everything UPPER CASE

• You can make a copy of a string in lower case or upper case
• Often when we are searching for a string using find() - we first convert the string to lower case so we can search a string regardless of case

>>> greet = 'Hello Bob'
>>> nnn = greet.upper()
>>> print nnn
HELLO BOB
>>> www = greet.lower()
>>> print www
hello bob
>>>
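The case-insensitive search described above might be sketched as follows. The helper find_ignore_case is hypothetical; it lowers both the string and the search term before calling find().

```python
# Hypothetical helper (not from the slides): find() that ignores upper/lower case.
def find_ignore_case(text, sub):
    return text.lower().find(sub.lower())

greet = 'Hello Bob'
print(greet.find('bob'))                 # -1 (the case does not match)
print(find_ignore_case(greet, 'bob'))    # 6
```

The position returned is still valid in the original string, because lower() does not change the length of ordinary ASCII text like this.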
Search and Replace

• The replace() function is like a "search and replace" operation in a word processor
• It replaces all occurrences of the search string with the replacement string

>>> greet = 'Hello Bob'
>>> nstr = greet.replace('Bob','Jane')
>>> print nstr
Hello Jane
>>> nstr = greet.replace('o','X')
>>> print nstr
HellX BXb
>>>
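replace() also accepts an optional third argument, count, which limits how many occurrences are replaced. A brief sketch, using single-argument print() so it runs under Python 2 and 3:

```python
greet = 'Hello Bob'
print(greet.replace('o', 'X'))       # HellX BXb (all occurrences)
print(greet.replace('o', 'X', 1))    # HellX Bob (only the first occurrence)
print(greet)                         # Hello Bob (the original is unchanged)
```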
Stripping Whitespace

• Sometimes we want to take a string and remove whitespace at the beginning and/or end
• lstrip() and rstrip() remove whitespace on the left and right only
• strip() removes both beginning and ending whitespace

>>> greet = ' Hello Bob '
>>> greet.lstrip()
'Hello Bob '
>>> greet.rstrip()
' Hello Bob'
>>> greet.strip()
'Hello Bob'
>>>
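A small sketch of why stripping matters when comparing text. The sample value raw is made up for illustration, and repr() is used only to make the remaining whitespace visible.

```python
raw = '  Hello Bob \n'                # text with stray spaces and a newline
print(raw == 'Hello Bob')             # False
print(raw.strip() == 'Hello Bob')     # True
print(repr(raw.lstrip()))             # 'Hello Bob \n'
print(repr(raw.rstrip()))             # '  Hello Bob'
```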
Prefixes

>>> line = 'Please have a nice day'
>>> line.startswith('Please')
True
>>> line.startswith('p')
False
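startswith() is case sensitive, as the second call above shows. Combining it with lower() gives a case-insensitive prefix test, and endswith() (listed among the string methods earlier) checks the other end of the string. A brief sketch:

```python
line = 'Please have a nice day'
print(line.startswith('p'))            # False
print(line.lower().startswith('p'))    # True
print(line.endswith('day'))            # True
```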
Parsing and Extracting

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008

(The @ is at position 21 and the space that follows the address is at position 31.)

>>> data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
>>> atpos = data.find('@')
>>> print atpos
21
>>> sppos = data.find(' ',atpos)
>>> print sppos
31
>>> host = data[atpos+1 : sppos]
>>> print host
uct.ac.za
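The same find-and-slice steps can be packaged as a function. This is a sketch under assumptions, not part of the slides: the name extract_host is hypothetical, and returning None when there is no @ and slicing to the end of the line when no space follows the address are illustrative choices.

```python
# Hypothetical helper (not from the slides): pull the host name out of a line
# containing an e-mail address.
def extract_host(line):
    atpos = line.find('@')
    if atpos == -1:
        return None                  # no address on this line
    sppos = line.find(' ', atpos)    # first space after the @
    if sppos == -1:
        sppos = len(line)            # address runs to the end of the line
    return line[atpos + 1:sppos]

data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
print(extract_host(data))                 # uct.ac.za
print(extract_host('no address here'))    # None
```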
Summary
• String type
• Read/Convert
• Indexing strings []
• Slicing strings [2:4]
• Looping through strings with for and while
• Concatenating strings with +
• String operations