Professional Documents
Culture Documents
2nd LESSON (ANKUR - PROSCHOOL) - DATA STRUCTURES IN PYTHON
2nd LESSON (ANKUR - PROSCHOOL) - DATA STRUCTURES IN PYTHON
Data structures are basically just that - they are structures which can hold some data together. In
other words, they are used to store a collection of related data.
A list is a mutable, ordered collection of objects. "Mutable" means a list can be altered after it is
created. You can, for example, add new items to a list or remove existing items. Lists are
heterogeneous, meaning they can hold objects of different types.
A list is always within square brackets and different items are separated by comma
CREATION OF LISTS
In [6]: # Alternatively, you can construct a list by passing some other iterabl
e into the list() function.
# An iterable describes an object you can look through one item at a ti
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
me, such as lists, tuples, strings and other sequences.
['L', 'i', 'f', 'e', ' ', 'i', 's', ' ', 'S', 't', 'u', 'd', 'y']
[]
Append()
You can add an item to an existing list with the list.append() function:
Insert()
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
Out[10]: ['apple', 'banana', 'orange', 'cherry']
EXTEND
You can also add a sequence to the end of an existing list with the list.extend()
function:
Remove()
Pop()
You can also remove items from a list using the list.pop() function. It removes the
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
specified index, (or the last item if index is not specified)
DEL()
You can use indexing to change the values within a list or delete items in a list:
In [15]: vowels=['a','e','i','o','u']
vowels.pop(2)
Out[15]: 'i'
In [16]: print(vowels)
del vowels[2]
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
In [17]: print(vowels)
In [19]: fruits
-----------------------------------------------------------------------
----
NameError Traceback (most recent call l
ast)
<ipython-input-19-5842b46f59fb> in <module>()
----> 1 fruits
CLEAR()
In [20]: vowels.clear()
print(vowels)
[]
COMBINING LISTS
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
combinedlist=my_list+empty_list
print(combinedlist)
We can also check the length, maximum, minimum and sum of a list with the len(), max(), min()
and sum() functions, respectively
In [22]: num_list=[1,3,5,7,9]
print(len(num_list)) # Check the length
print(max(num_list)) # Check the max
print(min(num_list)) # Check the min
print(sum(num_list)) # Check the sum
print(sum(num_list)/len(num_list)) # Check the average
5
9
1
25
5.0
*Note: Python does not have a built-in function to calculate the mean, but the numpy library we
will introduce in upcoming lessons does.
MEMBERSHIP OPERATORS
You can check whether a list contains a certain object with the "in" keyword:
In [23]: 1 in num_list
Out[23]: True
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
Add the keyword "not" to test whether a list does not contain an object:
Out[24]: False
COUNT
Count the occurrences of an object within a list using the list.count() function:
In [25]: num_list.append(3)
num_list.append(3)
num_list.append(3)
In [26]: num_list
Out[26]: [1, 3, 5, 7, 9, 3, 3, 3]
In [27]: num_list.count(3)
Out[27]: 4
new_list.sort()
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
print("Sorted List:", new_list) # Sor
t the List
Use the tab to within the () and use the REVERSE PARAMETER to sort in descending order:
#print vowels
print("Sorted Vowels:", vowels)
Lists and other Python sequences are indexed, meaning each position in the sequence has a
corresponding number called the index that you can use to look up the value at that position.
Python sequences are zero-indexed, so the first element of a sequence is at index position zero,
the second element is at index 1 and so on. Retrieve an item from a list by placing the index in
square brackets after the name of the list
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
print(another_list[0])
print(another_list[2])
Hello
bestest
If you supply a negative number when indexing into a list, it accesses items starting from the end
of the list (-1) going backward:
In [32]: print(another_list[-1])
print(another_list[-3])
friend
bestest
In [33]: print(another_list[5])
-----------------------------------------------------------------------
----
IndexError Traceback (most recent call l
ast)
<ipython-input-33-dc41c512029a> in <module>()
----> 1 print(another_list[5])
If your list contains other indexed objects, you can supply additional indexes to get items
contained within the nested objects:
In [34]: nested_list=[[1,2,3],[4,5,6],[7,8,9]]
print(nested_list[0][2]) # Extracting 3rd item
of 1st nested list
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
In [35]: nested_list=[[1,2,3],[4,5,6],[7,8,9]]
print(nested_list[2][0]) # Extracting 1st item
of 3rd nested list
You can take a slice (sequential subset) of a list using the syntax [START:STOP:STEP] where
start and stop are the starting and ending indexes for the slice and step controls how frequently
you sample values along the slice. The default step size is 1, meaning you take all values in the
range provided, starting from the first, up to but not including the last:
['my', 'bestest']
In [37]: # Slice the entire list but use step size 2 to get every second item:
my_slice=another_list[0:5:2] #Slice index 1 t
o start and index 3 to stop (index 3 is exclusive)
print(my_slice)
If you provide a negative number as the step, the slice steps backward:
['friend', 'old']
You can leave the starting or ending index blank to slice from the beginning or up to the end of
the list respectively:
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
In [39]: slice1=another_list[:4]
print(slice1)
In [40]: slice1=another_list[3:]
print(slice1)
['old', 'friend']
If you don't provide a start or ending index, you slice of the entire list:
In [41]: my_slice=another_list[:]
print(my_slice)
Using a step of -1 without a starting or ending index slices the entire list in reverse, providing a
shorthand to reverse a list:
In [42]: my_slice=another_list[::-1]
print(my_slice)
COPYING LISTS
There are ways to make a copy of a list using the list.copy() function:
In [43]: list1=[1,2,3]
list2=list1.copy()
list1.append(4)
print("List1:", list1)
print("List2:", list2)
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
List1: [1, 2, 3, 4]
List2: [1, 2, 3]
As expected, the copy was not affected by the append operation we performed on the original
list. The copy function (and slicing an entire list with [:]) creates what is known as a "shallow
copy." A shallow copy makes a new list where each list element refers to the object at the same
position (index) in the original list. This is fine when the list is contains immutable objects like ints,
floats and strings, since they cannot change. Shallow copies can however, have undesired
consequences when copying lists that contain mutable container objects, such as other lists.
Notice that when we use a shallow copy on list2, the second element of list2 and its copy both
refer to list1. Thus, when we append a new value into list1, the second element of list2 and the
copy, list3, both change. When you are working with nested lists, you have to make a "deepcopy"
if you want to truly copy nested objects in the original to avoid this behavior of shallow copies.
You can make a deep copy using the deepcopy() function in the copy module:
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
In [45]: import copy # Load the copy module
list1 = [1,2,3] # Make a list
list2 = ["List within a list", list1] # Nest it in another list
list3 = copy.deepcopy(list2) # Deep copy list2
print("Before appending to list1:")
print("List2:", list2)
print("List3:", list3, "\n")
list1.append(4) # Add an item to list1
print("After appending to list1:")
print("List2:", list2)
print("List3:", list3)
This time list3 is not changed when we append a new value into list1 because the second
element in list3 is a copy of list1 rather than a reference to list1 itself.
A tuple is a collection which is ordered and unchangeable. A tuple can also include duplicate
elements. In Python tuples are written with round bracket.
They are an immutable sequence data type that are commonly used to hold short collections of
related data. For instance, if you wanted to store latitude and longitude coordinates for cities,
tuples might be a good choice, because the values are related and not likely to change. Like
lists, tuples can store objects of different types.
CREATING TUPLES
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
Construct a tuple with a comma separated sequence of objects within parentheses:
In [46]: mytuple=(1,3,5)
print(mytuple)
(1, 3, 5)
Alternatively, you can construct a tuple by passing an iterable into the tuple() function:
In [47]: mylist=[2,3,1,4]
anothertuple=tuple(mylist)
anothertuple
Out[47]: (2, 3, 1, 4)
NOTE:
Tuples generally support the same indexing and slicing operations as lists and they also support
some of the same functions, with the caveat that tuples cannot be changed after they are
created. This means WE CAN DO things like find the length, max or min of a tuple, but WE
CAN'T APPEND new values to them OR REMOVE VALUES from them:
Tuples generally support the same indexing and slicing operations as lists and they also support
some of the same functions, with the caveat that tuples cannot be changed after they are
created. This means we can do things like find the length, max or min of a tuple, but we can't
append new values to them or remove values from them:
Out[48]: 1
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
In [49]: anothertuple[2:4] # You can slice tuples
Out[49]: (1, 4)
4
1
4
10
-----------------------------------------------------------------------
----
AttributeError Traceback (most recent call l
ast)
<ipython-input-51-ed48e70a7075> in <module>()
----> 1 anothertuple.append(1) # You can't append to a tuple
-----------------------------------------------------------------------
----
TypeError Traceback (most recent call l
ast)
<ipython-input-52-280d281472de> in <module>()
----> 1 del anothertuple[1] # You can't delete from a tuple
You can sort the objects in tuple using the sorted() function, but doing so creates a new list
containing the result rather than sorting the original tuple itself like the list.sort() function does
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
with lists:
In [53]: sorted(anothertuple)
Out[53]: [1, 2, 3, 4]
Out[54]: [4, 3, 2, 1]
Although tuples are immutable themselves, they can contain mutable objects like lists. This
means that the contents of a shallow copy of a tuple containing a list will change if the nested list
changes:
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
DATA STRUCTURE: DICTONARIES
SIGNIFICANCE OF DICTIONARIES
Sequence data types like lists are ordered. Ordering can be useful in some cases, such as if
your data is sorted or has some other natural sense of ordering, but it comes at a price. When
you search through sequences like lists, your computer has to go through each element one at a
time to find an object you're looking for. Consider the following code:
my_list = [1,2,3,4,5,6,7,8,9,10]
0 in my_list
When running the code above, Python has to search through the entire list, one item at a time
before it returns that 0 is not in the list. This sequential searching isn't much of a concern with
small lists like this one, but if you're working with data that contains thousands or millions of
values, it can add up quickly.
Dictionaries let you check whether they contain objects without having to search through each
element one at a time, at the cost of have no order and using more system memory.
In other words, it is an object that maps a set of named indexes called keys to a set of
corresponding values. Dictionaries are mutable, so you can add and remove keys and their
associated values. A dictionary's keys must be immutable objects, such as int, string or tuple, but
the values can be anything.
In Python dictionaries are written with curly brackets, and they have keys and values.
CREATE A DICTONARY
Create a dictionary with a comma-separated list of key: value pairs within curly
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
braces:
Notice that in the printed dictionary, the items don't appear in the same order as when we defined
it, since dictionaries are unordered.
ACCESS A DICTONARY
We can access the items of a dictionary by referring to its key name, inside square brackets:
In [58]: x=mydict['age']
print(x)
10
We can also use the get() method to get the same result:
In [59]: x=mydict.get('age')
print(x)
10
CHANGING VALUES
We can change the value of a specific item by referring to its key name:
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
{'name': 'Joe', 'age': 15, 'city': 'Paris'}
In [61]: mydict['Gender']='Male'
print(mydict)
REMOVING ITEMS
POP()
The pop() method removes the item with the specified key name
In [62]: mydict.pop('city')
print(mydict)
DEL()
The del keyword removes the item with the specified key name. The del keyword can also
delete the dictionary completely.
-----------------------------------------------------------------------
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
----
NameError Traceback (most recent call l
ast)
<ipython-input-63-2a0d72ed7222> in <module>()
1 del mydict
----> 2 print(mydict)
LENGTH OF A DICTONARY
In [65]: len(mydict)
Out[65]: 3
CHECKING EXISTENCE
Out[66]: True
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
In [67]: "gender" in mydict
Out[67]: False
Out[68]: True
You can access all the keys, all the values or all the key: value pairs of a dictionary with the
keys(), value() and items() functions respectively:
In [69]: mydict.keys()
In [70]: mydict.values()
In [71]: mydict.items()
Real world data often comes in the form tables of rows and columns, where each column
specifies a different data feature like name or age and each row represents an individual record.
We can encode this sort of tabular data in a dictionary by assigning each column label a key and
then storing the column values as a list.
Consider the following table: NAME AGE CITY Joe 10 PARIS Bob 15 NEW YORK Harry 20 TOKYO
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
'Paris', 'New York', 'Tokyo']}
print(mytabledict)
{'name': ['Joe', 'Bob', 'Harry'], 'age': [10, 15, 20], 'city': ['Pari
s', 'New York', 'Tokyo']}
Sets are unordered, mutable collections of objects that CANNOT CONTAIN DUPLICATES. Sets
are useful for storing and performing operations on data where each value is unique. In Python
sets are written with curly brackets.
CREATING SETS
Create a set with a comma separated sequence of values within curly braces:
In [73]: myset={1,2,3,4,5,6,7,8,9}
print(myset)
{1, 2, 3, 4, 5, 6, 7, 8, 9}
In [74]: set_list=[1,2,3,4,5,6]
mynewset=set(set_list)
print(mynewset)
{1, 2, 3, 4, 5, 6}
ADD()
To add more than one item to a set use the update() method.
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
In [75]: myset.add(10)
print(myset)
myset.update([11,12,13,14,15])
print(myset)
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
REMOVE()
An item can be removed from the set using the remove method.
In [76]: myset.remove(15)
print(myset)
In [77]: myset.discard(12)
print(myset)
In [78]: myset.discard(12)
print(myset)
NOTE: Discard does not throw an error if an item is not in the set but remove throws an error in
case the item to be removed is not in the set.
Like in Dictonary,
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
pop() will remove the last item in the set.
SET OPERATIONS
The main purpose of sets is to perform set operations that compare or combine different sets.
Python sets support many common mathematical set operations like:
union,
intersection,
difference and
UNION SETS
In [79]: set1={1,3,5,6}
set2={1,2,3,4}
set1.union(set2) #Get the Union of two sets
Out[79]: {1, 2, 3, 4, 5, 6}
Out[80]: {1, 3}
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD
{5, 6}
{2, 4}
In [82]: set1={1,3,5,6}
set3={1,3,6}
print(set1.issubset(set3)) # Check whether set1 is
a subset of set3
print(set3.issubset(set1)) # Check whether set3 is
a subset of set1
False
True
NOTE: Read syntax as: whether set3 is a subset of set1; whether set1 is a subset of set3
We can convert lists into set using the set() function. Converting a list to a set drops any
duplicate elements in the list. This can be a useful way to strip unwanted duplicate items or count
the number of unique elements in a list:
In [83]: list11=[1,2,2,2,3,3,3,4,4,4,4,4,5,5,6,6,6,6,6,7]
set(list11)
Out[83]: {1, 2, 3, 4, 5, 6, 7}
In [84]: len(set(list11))
Out[84]: 7
Create PDF in your applications with the Pdfcrowd HTML to PDF API PDFCROWD