
Strings
Chapter 6

Python for Informatics: Exploring Information
www.py4inf.com

Unless otherwise noted, the content of this course material is licensed under a Creative Commons Attribution 3.0 License.
http://creativecommons.org/licenses/by/3.0/.

Copyright 2010, Charles Severance

String Data Type

• A string is a sequence of characters
• A string literal uses quotes 'Hello' or "Hello"
• For strings, + means “concatenate”
• When a string contains numbers, it is still a string
• We can convert numbers in a string into a number using int()

>>> str1 = "Hello"
>>> str2 = 'there'
>>> bob = str1 + str2
>>> print bob
Hellothere
>>> str3 = '123'
>>> str3 = str3 + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects
>>> x = int(str3) + 1
>>> print x
124
>>>

Reading and Converting

• We prefer to read data in using strings and then parse and convert the data as we need
• This gives us more control over error situations and/or bad user input
• Raw input numbers must be converted from strings

>>> name = raw_input('Enter:')
Enter:Chuck
>>> print name
Chuck
>>> apple = raw_input('Enter:')
Enter:100
>>> x = apple - 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'str' and 'int'
>>> x = int(apple) - 10
>>> print x
90
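One way to get the control over bad input that the Reading and Converting slide mentions is to wrap int() in a try/except. The sketch below is only an illustration and is not from the original slides: the helper name to_int and its default parameter are hypothetical, and it uses print() with parentheses so it runs on Python 3 as well as the Python 2 used in these slides.

```python
def to_int(text, default=None):
    """Convert a string such as '100' to an int; return default on bad input."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings like 'ten' or ''
        return default

apple = '100'             # stands in for what raw_input('Enter:') would return
x = to_int(apple) - 10
print(x)                  # 90

bad = to_int('ten')       # not a number, so the default comes back
print(bad)                # None
```

Returning a default keeps the conversion and the error handling in one place; the caller decides what to do when the result is None.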
Looking Inside Strings

• We can get at any single character in a string using an index specified in square brackets
• The index value must be an integer and starts at zero
• The index value can be an expression that is computed

b a n a n a
0 1 2 3 4 5

>>> fruit = 'banana'
>>> letter = fruit[1]
>>> print letter
a
>>> n = 3
>>> w = fruit[n - 1]
>>> print w
n

A Character Too Far

• You will get a Python error if you attempt to index beyond the end of a string.
• So be careful when constructing index values and slices

>>> zot = 'abc'
>>> print zot[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>>
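Being careful with index values can be as simple as comparing the index against len() before using it. The following is a minimal sketch, not code from the slides: char_at is a hypothetical helper, and it treats negative indexes (which Python itself accepts) as out of range to keep the example small.

```python
def char_at(text, index):
    """Return text[index], or None when index falls outside the string."""
    if 0 <= index < len(text):
        return text[index]
    return None

zot = 'abc'
print(char_at(zot, 1))    # b
print(char_at(zot, 5))    # None, where zot[5] would raise IndexError

fruit = 'banana'
n = 3
print(char_at(fruit, n - 1))   # n (the index can be a computed expression)
```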

Strings Have Length

• There is a built-in function len that gives us the length of a string

b a n a n a
0 1 2 3 4 5

>>> fruit = 'banana'
>>> print len(fruit)
6

Len Function

A function is some stored code that we use. A function takes some input and produces an output.

>>> fruit = "banana"
>>> x = len(fruit)
>>> print x
6

[Diagram: “banana” (a string) goes into the len() function, which produces 6 (a number). Guido wrote this code.]


Len Function

[Diagram: “banana” (a string) goes into the len() function, which produces 6 (a number). The function box is opened up to show placeholder code: def len(inp): blah, blah, for x in y: blah, blah.]

Looping Through Strings

• Using a while statement and an iteration variable, and the len function, we can construct a loop to look at each of the letters in a string individually

index = 0
while index < len(fruit) :
    letter = fruit[index]
    print letter
    index = index + 1

b
a
n
a
n
a

Looping Through Strings

• A definite loop using a for statement is much more elegant
• The iteration variable is completely taken care of by the for loop

for letter in fruit :
    print letter

b
a
n
a
n
a

The same loop written with while, for comparison:

index = 0
while index < len(fruit) :
    letter = fruit[index]
    print letter
    index = index + 1
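To confirm that the while loop and the for loop visit the same letters in the same order, the sketch below collects the letters from each loop into a list and compares them. The list names are made up for this illustration, and lists themselves are used here only as a convenient container.

```python
fruit = 'banana'

# while style: we create and advance the index ourselves
while_letters = []
index = 0
while index < len(fruit):
    while_letters.append(fruit[index])
    index = index + 1

# for style: the loop hands us each letter in turn
for_letters = []
for letter in fruit:
    for_letters.append(letter)

print(while_letters == for_letters)   # True
```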
Looping and Counting

• This is a simple loop that loops through each letter in a string and counts the number of times the loop encounters the 'a' character.

word = 'banana'
count = 0
for letter in word :
    if letter == 'a' :
        count = count + 1
print count

Looking deeper into in

• The iteration variable “iterates” through the sequence (ordered set)
• The block (body) of code is executed once for each value in the sequence
• The iteration variable moves through all of the values in the sequence

for letter in 'banana' :
    print letter

(In the for statement, letter is the iteration variable and 'banana' is the six-character string.)
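The counting loop can be wrapped in a function so it works for any word and any letter. This is a sketch under that assumption: count_letter is a hypothetical name, not something defined in the slides. The built-in count() string method, which appears in the dir() listing later in the chapter, gives the same answer.

```python
def count_letter(word, target):
    """Count how many times the character target appears in word."""
    count = 0
    for letter in word:
        if letter == target:
            count = count + 1
    return count

print(count_letter('banana', 'a'))   # 3
print('banana'.count('a'))           # 3, the built-in method agrees
```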

[Flowchart: a “Done?” test with a “Yes” exit; otherwise “Advance letter”, then “print letter”, with letter stepping through b a n a n a.]

for letter in 'banana' :
    print letter

The iteration variable “iterates” through the string and the block (body) of code is executed once for each value in the sequence

Slicing Strings

• We can also look at any continuous section of a string using a colon operator
• The second number is one beyond the end of the slice - “up to but not including”
• If the second number is beyond the end of the string, it stops at the end

M o n t y   P y t h o n
0 1 2 3 4 5 6 7 8 9 10 11

>>> s = 'Monty Python'
>>> print s[0:5]
Monty
>>> print s[6:7]
P
>>> print s[6:20]
Python
Slicing Strings

• If we leave off the first number or the last number of the slice, it is assumed to be the beginning or end of the string respectively

M o n t y   P y t h o n
0 1 2 3 4 5 6 7 8 9 10 11

>>> s = 'Monty Python'
>>> print s[:2]
Mo
>>> print s[8:]
thon
>>> print s[:]
Monty Python

String Concatenation

• When the + operator is applied to strings, it means "concatenation"

>>> a = 'Hello'
>>> b = a + 'There'
>>> print b
HelloThere
>>> c = a + ' ' + 'There'
>>> print c
Hello There
>>>
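The slicing rules from the Slicing Strings slides can be checked side by side, and concatenation can glue the slices back together. This is a small sketch using the same 'Monty Python' string; the variable names are chosen only for this illustration.

```python
s = 'Monty Python'

first = s[0:5]    # positions 0 through 4: "up to but not including" 5
one = s[6:7]      # just the character at position 6
rest = s[6:20]    # end index past the string, so the slice stops at the end
head = s[:2]      # no first number: start at the beginning
tail = s[8:]      # no last number: run to the end

print(first)      # Monty
print(one)        # P
print(rest)       # Python
print(head)       # Mo
print(tail)       # thon

# Concatenating two slices and the space between them rebuilds the string
rebuilt = s[:5] + ' ' + s[6:]
print(rebuilt == s)   # True
```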

Multiplying Strings?

• While it is seldom useful, the asterisk operator applies to strings

>>> zig = 'Hi'
>>> zag = zig * 3
>>> print zag
HiHiHi
>>> x = ' ' * 80

Using in as an Operator

• The in keyword can also be used to check to see if one string is "in" another string
• The in expression is a logical expression and returns True or False and can be used in an if statement

>>> fruit = 'banana'
>>> 'n' in fruit
True
>>> 'm' in fruit
False
>>> 'nan' in fruit
True
>>> if 'a' in fruit :
...     print "Found it!"
...
Found it!
>>>
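The in operator is case-sensitive, so a common pattern is to lowercase both strings before testing. The sketch below illustrates that; contains_ignoring_case is a hypothetical helper name, not part of the slides. It also shows one place where multiplying a string is handy: drawing a divider line.

```python
def contains_ignoring_case(text, fragment):
    """Return True if fragment occurs in text, ignoring upper/lower case."""
    return fragment.lower() in text.lower()

fruit = 'banana'
print('nan' in fruit)                         # True
print('NAN' in fruit)                         # False, because in is case-sensitive
print(contains_ignoring_case(fruit, 'NAN'))   # True

rule = '-' * 20    # twenty dashes in a row
print(rule)
```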
String Comparison

if word == 'banana':
    print 'All right, bananas.'

if word < 'banana':
    print 'Your word,' + word + ', comes before banana.'
elif word > 'banana':
    print 'Your word,' + word + ', comes after banana.'
else:
    print 'All right, bananas.'

String Library

• Python has a number of string functions which are in the string library
• These functions are already built into every string - we call them by appending the function to the string variable
• These functions do not modify the original string, instead they return a new string that has been altered

>>> greet = 'Hello Bob'
>>> zap = greet.lower()
>>> print zap
hello bob
>>> print greet
Hello Bob
>>> print 'Hi There'.lower()
hi there
>>>
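The comparison from the String Comparison slide can be tried on a few words by putting it in a function. This is a sketch, not the author's code: compare_to_banana is a hypothetical name, the messages are returned rather than printed, and a space is added after 'Your word,' so the output reads cleanly. Note that uppercase letters compare as smaller than lowercase letters, so lowercasing with lower() first usually gives the ordering people expect.

```python
def compare_to_banana(word):
    """Describe where word falls relative to 'banana' in string ordering."""
    if word < 'banana':
        return 'Your word, ' + word + ', comes before banana.'
    elif word > 'banana':
        return 'Your word, ' + word + ', comes after banana.'
    else:
        return 'All right, bananas.'

print(compare_to_banana('apple'))           # ... comes before banana.
print(compare_to_banana('cherry'))          # ... comes after banana.
print(compare_to_banana('banana'))          # All right, bananas.

# 'Z' sorts before 'b', so the capitalized word lands "before" banana
print(compare_to_banana('Zebra'))           # ... comes before banana.
print(compare_to_banana('Zebra'.lower()))   # ... comes after banana.
```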

Two ways to call the library

• We can call string functions by appending the function name to the string variable
• We can import the string library and pass the string as a parameter

>>> greet = 'Hello Bob'
>>> zap = greet.lower()
>>> print zap
hello bob
>>> print 'Hi There'.lower()
hi there
>>> import string
>>> print string.lower('Hi There')
hi there

What is a Library?

• Some super developers in the Python world write the libraries for us to use
• Somewhere there is a file string.py with a bunch of def statements

string.py

def split(inp):
    blah
    blah

def upper(inp):
    for i in blah:
        blah

def find(inp):
    blah
    blah
>>> stuff = 'Hello world'
>>> type(stuff)
<type 'str'>
>>> dir(stuff)
['capitalize', 'center', 'count', 'decode', 'encode',
'endswith', 'expandtabs', 'find', 'format', 'index',
'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace',
'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip',
'partition', 'replace', 'rfind', 'rindex', 'rjust',
'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines',
'startswith', 'strip', 'swapcase', 'title', 'translate',
'upper', 'zfill']

http://docs.python.org/lib/string-methods.html

String Library

str.capitalize()
str.center(width[, fillchar])
str.endswith(suffix[, start[, end]])
str.find(sub[, start[, end]])
str.lstrip([chars])
str.replace(old, new[, count])
str.lower()
str.rstrip([chars])
str.strip([chars])
str.upper()

http://docs.python.org/lib/string-methods.html

Searching a String

• We use the find() function to search for a substring within another string
• find() finds the first occurrence of the substring
• If the substring is not found, find() returns -1
• Remember that string position starts at zero

b a n a n a
0 1 2 3 4 5

>>> fruit = 'banana'
>>> pos = fruit.find('na')
>>> print pos
2
>>> aa = fruit.find('z')
>>> print aa
-1
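Because find() returns only the first occurrence, its optional start argument (shown in the method list as str.find(sub[, start[, end]])) can be used to keep searching from just past the previous hit. The sketch below uses that idea; find_all is a hypothetical helper, not part of the string library.

```python
def find_all(text, sub):
    """Return the starting positions of every occurrence of sub in text."""
    positions = []
    pos = text.find(sub)
    while pos != -1:                     # -1 means "not found"
        positions.append(pos)
        pos = text.find(sub, pos + 1)    # resume one past the last hit
    return positions

fruit = 'banana'
print(fruit.find('na'))        # 2
print(find_all(fruit, 'na'))   # [2, 4]
print(find_all(fruit, 'z'))    # []
```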
Making everything UPPER CASE

• You can make a copy of a string in lower case or upper case
• Often when we are searching for a string using find() - we first convert the string to lower case so we can search a string regardless of case

>>> greet = 'Hello Bob'
>>> nnn = greet.upper()
>>> print nnn
HELLO BOB
>>> www = greet.lower()
>>> print www
hello bob
>>>

Search and Replace

• The replace() function is like a “search and replace” operation in a word processor
• It replaces all occurrences of the search string with the replacement string

>>> greet = "Hello Bob"
>>> nstr = greet.replace("Bob","Jane")
>>> print nstr
Hello Jane
>>> greet = "Hello Bob"
>>> nstr = greet.replace("o","X")
>>> print nstr
HellX BXb
>>>
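The method list earlier shows replace() with an optional count argument, str.replace(old, new[, count]), which limits how many occurrences are replaced. The short sketch below contrasts the two forms and shows that the original string is left untouched; the variable names are made up for the example.

```python
greet = 'Hello Bob'

all_x = greet.replace('o', 'X')         # every 'o' is replaced
first_x = greet.replace('o', 'X', 1)    # only the first 'o' is replaced

print(all_x)      # HellX BXb
print(first_x)    # HellX Bob
print(greet)      # Hello Bob (replace returned new strings)
```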

Stripping Whitespace

• Sometimes we want to take a string and remove whitespace at the beginning and/or end
• lstrip() and rstrip() remove whitespace at the left and right only
• strip() removes both beginning and ending whitespace

>>> greet = ' Hello Bob '
>>> greet.lstrip()
'Hello Bob '
>>> greet.rstrip()
' Hello Bob'
>>> greet.strip()
'Hello Bob'
>>>

Prefixes

>>> line = 'Please have a nice day'
>>> line.startswith('Please')
True
>>> line.startswith('p')
False
>>> line.lower()
'please have a nice day'
>>> line.lower().startswith('p')
True
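Stripping and prefix testing are often combined when the input may have stray spaces or mixed case. The following is a sketch of that combination; starts_with_please is a hypothetical helper written for this example.

```python
def starts_with_please(line):
    """True if line begins with 'please', ignoring case and outer whitespace."""
    cleaned = line.strip()      # drop whitespace at both ends
    lowered = cleaned.lower()   # make the test case-insensitive
    return lowered.startswith('please')

print(starts_with_please('   Please have a nice day \n'))   # True
print(starts_with_please('have a nice day'))                 # False
```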
Multiple Method Calls

line = 'Please'
line.lower().startswith('p')
'please'.startswith('p')

When we use line.lower() - it returns a string and then we call startswith('p') on that returned string.

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
(the '@' is at position 21 and the space after the host name is at position 31)

>>> data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
>>> atpos = data.find('@')
>>> print atpos
21
>>> sppos = data.find(' ',atpos)
>>> print sppos
31
>>> host = data[atpos+1 : sppos]
>>> print host
uct.ac.za
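The find-and-slice steps used to pull out the host name can be collected into a function that also copes with lines that have no '@' or no trailing space. This is a sketch under those assumptions; extract_host is a hypothetical name and the None return value is a choice made for the example.

```python
def extract_host(line):
    """Return the text between '@' and the next space, or None if no '@'."""
    atpos = line.find('@')
    if atpos == -1:
        return None                 # no '@' in the line at all
    sppos = line.find(' ', atpos)   # first space after the '@'
    if sppos == -1:
        sppos = len(line)           # no space: the host runs to the end
    return line[atpos + 1:sppos]

data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
print(extract_host(data))           # uct.ac.za
print(extract_host('no address'))   # None
```

Checking for -1 matters here: without it, a missing '@' would make the slice start at position 0 and silently return the wrong text.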

Format Operator

• The percent-sign (%) is a "formatting operator" that takes a string with format sequences and a list of variables to "poke" into the string

>>> camels = 42
>>> 'I have spotted %d camels.' % camels
'I have spotted 42 camels.'
>>> 'Hi %s have a nice %s!' % ('Chuck', 'week')
'Hi Chuck have a nice week!'
>>> 'In %d years I have spotted %g %s.' % (3, 0.1, 'camels')
'In 3 years I have spotted 0.1 camels.'

Breaking Strings into Parts

• We are often presented with input that we need to break into pieces
• We use the string.split() function to break a string into a list of strings

>>> abc = 'With three words'
>>> stuff = abc.split()
>>> print stuff
['With', 'three', 'words']
>>>
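split() and the format operator work well together: split breaks a line into pieces and % assembles a message from them. The sketch below reuses the mail header line from the earlier slide; passing '@' to split() makes it break on that character instead of on whitespace. The variable names and the message wording are made up for this illustration.

```python
data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'

words = data.split()         # break on whitespace
email = words[1]             # second word is the address
pieces = email.split('@')    # break the address at the '@'
host = pieces[1]

# %s pokes in a string, %d pokes in an integer
message = 'Host %s, %d words' % (host, len(words))
print(message)               # Host uct.ac.za, 7 words
```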
Summary

• String type
• Read/Convert
• Indexing strings []
• Slicing strings [2:4]
• Looping through strings with for and while
• Concatenating strings with +
• in as an operator
• String comparison
• String library
• Searching in strings
• Replacing text
• Stripping white space
• Pulling strings apart with slice
• Format operator %
