
PY4E: DATA STRUCTURES

STRINGS
 A string is a sequence of characters
 A string literal uses quotes: "hello"
 For strings, + means “concatenate”
 When a string contains numbers, it is still a string
 We can convert numbers in a string into a number using int()
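
A minimal sketch of these points; the variable names are illustrative only:

```python
# + concatenates strings; it does not add them numerically.
greeting = "hello" + " " + "there"
print(greeting)            # hello there

# Digits inside quotes are still a string.
count_text = "123"
print(type(count_text))    # <class 'str'>

# int() converts a numeric string into an integer we can do arithmetic with.
count = int(count_text) + 1
print(count)               # 124
```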

LOOKING INSIDE STRINGS


 We can get any single character in a string using an index specified in square brackets
 The index value must be an integer and starts at zero
 The index value can be an expression that is computed
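
As a small illustration (the string "banana" and the variable n are just sample values):

```python
fruit = "banana"
print(fruit[0])        # b  (indexes start at zero)
print(fruit[1])        # a

# The index can be any expression that evaluates to an integer.
n = 3
print(fruit[n - 1])    # n  (same as fruit[2])
```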

A CHARACTER TOO FAR


 You will get a Python error (an IndexError) if you attempt to index beyond the end of a string
 So be careful when constructing index values and slices

zot = "abc"
print(zot[5])
# IndexError: string index out of range

STRINGS HAVE LENGTH


 The built-in function len gives us the length of a string

The len function
A function is some stored code that we use. A function takes some input and produces
an output
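
For example, using a sample string chosen for illustration:

```python
fruit = "banana"
length = len(fruit)
print(length)                  # 6

# The last valid index is always len() - 1.
print(fruit[length - 1])       # a
```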

LOOPING THROUGH STRINGS


 Using a while statement and an iteration variable, and the len function, we can
construct a loop to look at each of the letters in a string individually
 A definite loop using a for statement is much more elegant
 The iteration variable is completely taken care of by the for loop
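
The two styles can be sketched side by side; the names fruit, index and letter are illustrative:

```python
fruit = "banana"

# while loop: we manage the index ourselves.
index = 0
while index < len(fruit):
    letter = fruit[index]
    print(index, letter)
    index = index + 1

# for loop: the for statement manages the iteration variable for us.
for letter in fruit:
    print(letter)
```

Both loops visit the same six letters; the for version simply has less bookkeeping to get wrong.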

LOOKING DEEPER INTO IN


 The iteration variable “iterates” through the sequence (ordered set)
 The block (body) of code is executed once for each value in the sequence
 The iteration variable moves through all of the values in the sequence

SLICING STRINGS
 We can also look at any continuous section of a string using a colon operator
 The second number is one beyond the end of the slice: “up to but not including”
 If the second number is beyond the end of the string, it stops at the end
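
A short sketch of slicing, using a sample string picked for illustration:

```python
s = "Monty Python"
print(s[0:4])     # Mont    (positions 0, 1, 2, 3)
print(s[6:7])     # P
print(s[6:20])    # Python  (stops at the end of the string)
print(s[:2])      # Mo      (a missing first number means "from the start")
```
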
MANIPULATING STRINGS
STRING CONCATENATION
 When the + operator is applied to strings, it means “concatenation”

USING IN AS A LOGICAL OPERATOR


 The in keyword can also be used to check whether one string is “in” another string
 The in expression is a logical expression that returns True or False and can be used in an if
statement
 Like ==, it produces a True/False result, but it tests for containment rather than equality
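
For instance (sample values only):

```python
fruit = "banana"
print("n" in fruit)        # True
print("m" in fruit)        # False
print("nan" in fruit)      # True

found = False
if "a" in fruit:
    found = True
    print("Found it!")
```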

STRING LIBRARY
 Python has a number of string functions (methods) which are in the string library
 These functions are already built into every string: we invoke them by appending a dot and the
function name to the string variable
 These functions do not modify the original string; instead they return a new string that
has been altered
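
A brief sketch using lower(), one of the built-in string methods; the variable names are illustrative:

```python
greet = "Hello Bob"
zap = greet.lower()
print(zap)                 # hello bob
print(greet)               # Hello Bob  (the original is unchanged)
print("Hi There".lower())  # hi there
```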

SEARCHING A STRING
 We use the find() function to search for a substring within another string
 find() finds the first occurrence of the substring
 If the substring is not found, find() returns -1
 Remember that string position starts at zero
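
For example, with a sample string:

```python
fruit = "banana"
pos = fruit.find("na")
print(pos)                 # 2   (first occurrence, counting from zero)

missing = fruit.find("z")
print(missing)             # -1  (not found)
```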

SEARCH AND REPLACE


 The replace() function is like a “search and replace” operation in a word processor
 It replaces all occurrences of the search string with the replacement string
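
A small sketch; the strings are sample values:

```python
greet = "Hello Bob"
nstr = greet.replace("Bob", "Jane")
print(nstr)                      # Hello Jane

# Every occurrence is replaced, not just the first.
nstr = greet.replace("o", "X")
print(nstr)                      # HellX BXb
```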

STRIPPING WHITESPACE
 Sometimes we want to take a string and remove whitespace at the beginning and/or
end
 lstrip() and rstrip() remove whitespace at the left or right
 strip() removes both beginning and ending whitespace
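
To make the whitespace visible, this sketch prints each result with repr(); the sample string is illustrative:

```python
greet = "   Hello Bob  "
print(repr(greet.lstrip()))   # 'Hello Bob  '
print(repr(greet.rstrip()))   # '   Hello Bob'
print(repr(greet.strip()))    # 'Hello Bob'
```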

FILES
OPENING A FILE
 Before we can read the contents of a file, we must tell Python which file we are
going to work with and what we will be doing with the file
 This is done with the open() function
 open() returns a “file handle”, a variable used to perform operations on the file
 This is similar to “File -> Open” in a word processor

USING OPEN()
 handle = open(filename, mode), for example fhand = open("mbox.txt", "r")
 open() returns a handle used to manipulate the file
 filename is a string
 mode is optional and should be "r" if we are planning to read the file and "w" if we are
going to write to the file
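
A self-contained sketch of both modes. It assumes a throwaway file named demo.txt in a temporary directory (a hypothetical name, used so the example does not depend on mbox.txt being present):

```python
import os
import tempfile

# Hypothetical scratch file, created only for this example.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Mode "w" opens the file for writing (creating it if needed).
fout = open(path, "w")
fout.write("first line\nsecond line\n")
fout.close()

# Mode "r" opens it for reading; it is also the default when mode is omitted.
fhand = open(path, "r")
text = fhand.read()
fhand.close()
print(text)
```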

THE NEWLINE CHARACTER


 We use a special character called the “newline” to indicate when a line ends
 We represent it as \n in strings
 Newline is still one character, not two
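
A quick check of the "one character" point, with a sample string:

```python
stuff = "X\nY"
print(stuff)        # prints X and Y on separate lines
print(len(stuff))   # 3  (X, newline, Y)
```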

PROCESSING FILES
 A file handle open for reading can be treated as a sequence of strings where each line in
the file is a string in the sequence
 We can use the for statement to iterate through a sequence
 Remember: a sequence is an ordered set

COUNTING LINES IN A FILE


 Open a file read-only
 Use a for loop to read each line
 Count the lines and print out the number of lines

handle = open("mbox.txt")
count = 0
for line in handle:
    count = count + 1
print("Line count:", count)

READING THE WHOLE FILE


 We can read the whole file (newlines and all) into a single string

fhand = open("mbox-short.txt")
inp = fhand.read()
print(len(inp))     # number of characters in the file
print(inp[:20])     # the first 20 characters

SEARCHING THROUGH A FILE


 We can put an if statement in our for loop to only print lines that meet some criteria

fhand = open("mbox-short.txt")
for line in fhand:
    if line.startswith("From: "):
        # each line still ends with "\n", so print() leaves a blank line after it
        print(line)

SEARCHING THROUGH A FILE (FIXED)


 We can strip the whitespace from the right-hand side of the string using rstrip() from
the string library
 The newline is considered whitespace and is stripped

fhand = open("mbox-short.txt")
for line in fhand:
    line = line.rstrip()    # remove the trailing newline
    if line.startswith("From: "):
        print(line)

SKIPPING WITH CONTINUE


 We can conveniently skip a line by using the continue statement

fhand = open("mbox-short.txt")
for line in fhand:
    line = line.rstrip()
    if not line.startswith("From: "):
        continue            # skip lines we are not interested in
    print(line)

USING IN TO SELECT LINES


 We can look for a string anywhere in a line as our selection criteria

fhand = open("mbox-short.txt")
for line in fhand:
    line = line.rstrip()
    if "@uct.ac.za" not in line:
        continue
    print(line)

LISTS
PROGRAMMING
Algorithm: a set of rules or steps used to solve a problem

Data structure: a particular way of organizing data in a computer

WHAT IS NOT A COLLECTION?


 Most of our variables have one value in them; when we put a new value in the
variable, the old value is overwritten

x = 2
x = 4
print(x)
4

A LIST IS A KIND OF COLLECTION


 A collection allows us to put many values in a single variable
 A collection is nice because we can carry many values around in one convenient
package
friends = ["joseph", "glenn", "sally"]
carryon = ["socks", "shirt", "perfume"]

LIST CONSTANTS
 List constants are surrounded by square brackets and the elements in the list are
separated by commas
 A list element can be any Python object, even another list
 A list can be empty
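
A few list constants, with sample values chosen for illustration:

```python
numbers = [1, 24, 76]
colors = ["red", "yellow", "blue"]
mixed = ["red", 24, 98.6]
nested = [1, [5, 6], 7]     # a list can contain another list
empty = []

print(len(nested))          # 3  (the inner list counts as one element)
print(empty)                # []
```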

LOOKING INSIDE LISTS


 Just like strings, we can get at any single element in a list using an index specified in
square brackets

LISTS ARE MUTABLE


 Strings are “immutable” – we cannot change the contents of a string – we must make
a new string to make any change
 Lists are “mutable” – we can change an element of a list using the index operator
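
The contrast can be sketched as follows; the replacement name "anna" is a made-up sample value:

```python
friends = ["joseph", "glenn", "sally"]
print(friends[1])            # glenn

# Lists are mutable: assign to an index to change an element.
friends[1] = "anna"
print(friends)               # ['joseph', 'anna', 'sally']

# Strings are immutable: item assignment raises TypeError.
fruit = "banana"
try:
    fruit[0] = "B"
except TypeError as err:
    print("TypeError:", err)

# Build a new string instead.
new_fruit = "B" + fruit[1:]
print(new_fruit)             # Banana
```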

HOW LONG IS A LIST?


 The len() function takes a list as a parameter and returns the number of elements in
the list
 Actually len() tells us the number of elements of any set or sequence (such as a
string…)

USING THE RANGE FUNCTION


 The range function returns a sequence of numbers that range from zero to one less than the
parameter (in Python 3 this is a range object rather than a list)
 We can construct an index loop using for and an integer iterator
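
A sketch of an index loop over a sample list; list() is used only to display the numbers that range produces:

```python
friends = ["joseph", "glenn", "sally"]
print(len(friends))                 # 3
print(list(range(len(friends))))    # [0, 1, 2]

# Index loop: i takes the values 0, 1, 2.
for i in range(len(friends)):
    print(i, friends[i])
```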

MANIPULATING LISTS
CONCATENATING LISTS USING (+)
 We can create a new list by adding two existing lists together
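
For example (sample values):

```python
a = [1, 2, 3]
b = [4, 5, 6]
c = a + b
print(c)    # [1, 2, 3, 4, 5, 6]
print(a)    # [1, 2, 3]  (the original lists are unchanged)
```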

LISTS CAN BE SLICED USING (:)


 Remember: just like in strings, the second number is “up to but not including”
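
A short sketch with a sample list:

```python
t = [9, 41, 12, 3, 74, 15]
print(t[1:3])   # [41, 12]       (positions 1 and 2, not 3)
print(t[:4])    # [9, 41, 12, 3]
print(t[3:])    # [3, 74, 15]
```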

BUILDING A LIST FROM SCRATCH


 We can create an empty list and then add elements using the append method
 The list stays in order and new elements are added at the end of the list
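
For instance, with illustrative values:

```python
stuff = []
stuff.append("book")
stuff.append(99)
print(stuff)            # ['book', 99]

stuff.append("cookie")
print(stuff)            # ['book', 99, 'cookie']
```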

IS SOMETHING IN A LIST?
 Python provides two operators, in and not in, that let you check if an item is in a list
 These are logical operators that return True or False
 They do not modify the list
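
A minimal sketch with a sample list:

```python
some = [1, 9, 21, 10, 16]
print(9 in some)         # True
print(15 in some)        # False
print(20 not in some)    # True
print(some)              # [1, 9, 21, 10, 16]  (unchanged)
```
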
LISTS ARE IN ORDER
 A list can hold many items and keeps those items in order until we do something to
change the order
 A list can be sorted (i.e., its order can be changed)
 The sort method (unlike string methods) means “sort yourself”: it changes the list in place
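
The in-place behaviour can be seen in this sketch; the variable result is there only to show what sort() returns:

```python
friends = ["joseph", "glenn", "sally"]
result = friends.sort()
print(friends)      # ['glenn', 'joseph', 'sally']
print(result)       # None  (sort() changes the list itself and returns nothing)
```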

BUILT IN FUNCTIONS AND LISTS


 There are a number of functions built into Python that take lists as parameters
 Remember the loops we built? These are much simpler
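
A sketch using len, max, min and sum on a sample list of numbers:

```python
nums = [3, 41, 12, 9, 74, 15]
print(len(nums))               # 6
print(max(nums))               # 74
print(min(nums))               # 3
print(sum(nums))               # 154

# The average, without writing a loop.
average = sum(nums) / len(nums)
print(average)
```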

LISTS AND STRINGS


BEST FRIENDS: STRINGS AND LISTS
 split() breaks a string into parts and produces a list of strings. We think of these as
words. We can access a particular word or loop through all the words
 When you do not specify a delimiter, multiple spaces are treated like one delimiter
 You can specify what delimiter character to use in the splitting
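
The three points above can be sketched with sample strings:

```python
abc = "With three words"
stuff = abc.split()
print(stuff)            # ['With', 'three', 'words']
print(stuff[0])         # With

# Runs of spaces count as a single delimiter.
line = "A lot               of spaces"
print(line.split())     # ['A', 'lot', 'of', 'spaces']

# With no whitespace in the string, split() finds nothing to split on.
line = "first;second;third"
print(line.split())     # ['first;second;third']
print(line.split(";"))  # ['first', 'second', 'third']
```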

THE DOUBLE SPLIT PATTERN


 Sometimes we split a line one way and then grab one of the pieces of the line and split
that piece again

words = line.split()
email = words[1]
pieces = email.split("@")
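
A runnable version of the pattern, assuming a hypothetical mbox-style line (the address is made up for illustration):

```python
# Hypothetical sample line; not taken from a real file.
line = "From alice@example.com Sat Jan  5 09:14:16 2008"

words = line.split()          # first split: on whitespace
email = words[1]              # the second word is the address
pieces = email.split("@")     # second split: on the @ sign
print(pieces)                 # ['alice', 'example.com']
print(pieces[1])              # example.com
```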

DICTIONARIES
WHAT IS A COLLECTION
 A collection is nice because we can put more than one value in it and carry them all
around in one convenient package
 We have a bunch of values in a single variable
 We do this by having more than one place in the variable
 We have ways of finding the different places in the variable

A STORY OF TWO COLLECTIONS


 List: a linear collection of values that stay in order
 Dictionary: a bag of values, each with its own label

DICTIONARIES
 Lists index their entries based on the position in the list
 Dictionaries are like bags: entries are not looked up by position
 So we index the things we put in the dictionary with a lookup tag

COMPARING LISTS AND DICTIONARIES


 Dictionaries are like lists except that they use keys instead of numbers to look up
values
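
Side by side, with illustrative names and values:

```python
# A list uses integer positions.
lst = []
lst.append(21)
lst.append(183)
lst[0] = 23
print(lst)            # [23, 183]

# A dictionary uses keys.
ddd = {}
ddd["age"] = 21
ddd["course"] = 182
ddd["age"] = 23
print(ddd)            # {'age': 23, 'course': 182}
```
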
DICTIONARY LITERALS (CONSTANTS)
 Dictionary literals use curly braces and have a list of key: value pairs
 You can make an empty dictionary using empty curly braces
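
For example (the keys and values are sample data):

```python
jjj = {"chuck": 1, "fred": 42, "jan": 100}
print(jjj)            # {'chuck': 1, 'fred': 42, 'jan': 100}
print(jjj["fred"])    # 42

ooo = {}
print(ooo)            # {}
```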

MANY COUNTERS WITH A DICTIONARY


 One common use of dictionaries is counting how often we see something

DICTIONARY TRACEBACKS
 It is an error to reference a key which is not in the dictionary
 We can use the in operator to see if a key is in the dictionary
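
A sketch of both points; the try/except is there only so the example keeps running after the error:

```python
ccc = {}

# Referencing a missing key raises KeyError.
try:
    print(ccc["csev"])
except KeyError as err:
    print("KeyError:", err)     # KeyError: 'csev'

# The in operator checks for a key without raising an error.
present = "csev" in ccc
print(present)                  # False
```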

WHEN WE SEE A NEW NAME


 When we encounter a new name, we need to add a new entry in the dictionary; if
this is the second or later time we have seen the name, we simply add one to the count
in the dictionary under that name
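
This can be sketched as follows; the list of names is sample data:

```python
counts = {}
names = ["csev", "cwen", "csev", "zqian", "cwen"]
for name in names:
    if name not in counts:
        counts[name] = 1                    # first time we see this name
    else:
        counts[name] = counts[name] + 1     # seen before: add one
print(counts)    # {'csev': 2, 'cwen': 2, 'zqian': 1}
```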

THE GET METHOD FOR DICTIONARIES


 The pattern of checking to see if a key is already in a dictionary and assuming a default
value if the key is not there is so common that there is a method called get() that does
this for us
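
The long form and the get() form, shown together with sample data:

```python
counts = {"csev": 2}

# Long form: check for the key, fall back to a default.
if "cwen" in counts:
    x = counts["cwen"]
else:
    x = 0

# Equivalent short form: get(key, default).
y = counts.get("cwen", 0)
print(x, y)                     # 0 0
print(counts.get("csev", 0))    # 2
```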

SIMPLIFIED COUNTING WITH GET()


 We can use get() and provide a default value of zero when the key is not yet in the
dictionary and then just add one
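
A minimal sketch of the one-line counting idiom, using a sample list of names:

```python
counts = {}
names = ["csev", "cwen", "csev", "zqian", "cwen"]
for name in names:
    # 0 if the name is new, otherwise its current count; then add one.
    counts[name] = counts.get(name, 0) + 1
print(counts)    # {'csev': 2, 'cwen': 2, 'zqian': 1}
```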

DICTIONARIES AND FILES


Counting pattern

The general pattern to count the words in a line of text is to split the line into words, then loop
through the words and use a dictionary to track the count of each word independently
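
A sketch of the pattern on a single made-up line of text:

```python
line = "the clown ran after the car and the car ran into the tent"

counts = {}
for word in line.split():
    counts[word] = counts.get(word, 0) + 1

print(counts["the"])   # 4
print(counts["car"])   # 2
print(counts["tent"])  # 1
```

For a whole file, the same loop would sit inside a for loop over the file handle.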

Definite loops and dictionaries

Even though dictionaries are not indexed by position, we can write a for loop that goes through all
the entries in a dictionary. Actually, it goes through all of the keys in the dictionary and looks
up the value
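
For example, with sample data (the list seen is there only to record which keys the loop visited):

```python
counts = {"chuck": 1, "fred": 42, "jan": 100}

seen = []
for key in counts:
    # The loop variable is the key; we look up the value ourselves.
    print(key, counts[key])
    seen.append(key)
```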
