Python Fundamentals

MathByte Academy
© MathByte Academy

1. Course Overview

What this course is about…
→ introduction to Python from first principles
→ you will have the knowledge and understanding to learn more on your own
→ get insights into 3rd party Python libraries in general
→ learn the basics of some common 3rd party libraries
    → numerical computations
    → manipulating data sets
    → charting

Content Overview
→ installing and running Python
→ basic principles
    numerical types (integers, floats, booleans)
    variables
    operators (arithmetic, logical, boolean)
→ control flow
    conditional execution
    iteration (iterables, iterators, loops)
    exception handling
→ advanced data types
    sequence types (lists, tuples, strings)
    dictionaries and sets
    dates and times
    decimals
→ functional programming
    functions
    higher-order functions
    closures
    decorators
→ object-oriented programming
    custom classes
    methods
    properties
→ data acquisition
    CSV
    JSON
    Using REST APIs
→ 3rd party libraries
    pytz
    dateutil
    (http) requests
    numpy
    pandas
    matplotlib

Prerequisites
→ Windows, Mac
    → needs to run Python 3.6+ (recommend at least 3.8/3.9 or higher)
    → Windows 10, Mac 10.9 and higher
→ Linux – you'll need to find/use installation instructions
→ some familiarity with Terminal (Mac/Linux) / Command Prompt (Windows)
    → will be needed to install and run Python
    → just the basics
        → how to open/terminate a shell
        → change directory
        → list contents of a directory
        → create a directory
→ no prior Python knowledge needed
→ prior programming knowledge helpful, but not required

Course Structure and Materials
→ each topic is organized in two parts
    → lecture video: sit back, watch, take notes if you like
    → coding video: lean in and code along – pause video, rewind, type code!
→ exercises
    → each section (except installation section) has a set of exercises
    → make sure you are confident before moving on to the next section
→ downloadable course materials
    → all coding videos have an accompanying Jupyter Notebook / resources
        → contains all the code we do in the coding video
        → contains any required extras (such as data sets)
    → notebooks are fully annotated

2. Installing and Running Python
→ what is Python?
    → language vs implementation
    → the canonical, or reference implementation of Python
→ installing Python
    → side by side versions of Python
    → virtual environments
    → installing 3rd party libraries
→ running Python code
    → interactive mode
    → script mode

Python

A bit of history…
→ created by Guido van Rossum in 1989 while working at CWI (National Research Institute for Mathematics and Computer Science, Netherlands)
→ became a community driven effort, overseen by Guido
    → who became the BDFL (benevolent dictator for life) (stepped down in 2018)
→ many developers ("core" developers) have contributed over the years
→ was named after the British comedy group Monty Python (who says developers don't have a sense of humour!)
→ still actively developed today
    Python 2.0 released in 2000 → last release was 2.7 (end of life January 1, 2020; final release, 2.7.18, in April 2020)
    Python 3.0 released in 2008 → 3.9 released in October 2020

What is Python?
→ Python is a language, not an application
→ there are many implementations of Python
    CPython
    PyPy
    → even compilers that "translate" Python code to other languages
        IronPython → .NET
        Jython → Java
        Cython → C/C++
→ the "reference" Python implementation is CPython

CPython
→ reference implementation
→ https://www.python.org
→ most widely used distribution of Python
→ open source → written in C
    https://github.com/python/cpython
→ includes the standard library
    → a collection of additional functionality that goes beyond just the Python language
    → written in C and Python
→ this implementation of Python and the standard library is the "official" implementation
→ many platforms: Linux, Windows, Mac OS, iOS, Android, PlayStation, Xbox, …

Installing Python

Installing CPython
→ CPython is basically a bunch of files, located in some directory on your computer
→ one of those files is an executable that is used to run Python code files or an interactive shell
→ entire standard library is also included in these files
→ to "install" CPython you simply copy these files into a directory
→ possible to have multiple versions of CPython on the same computer
    → just install the files in different directories
    → call the desired Python executable

Where to find installation packages
→ https://www.python.org
→ click on Downloads → all releases
→ choose which version you want and what OS you are on
→ or use default (which should recognize your OS)
→ two videos included in this course
    → Windows Installation
    → Mac Installation
    (Linux installation is same as Mac, just download for Linux OS)
→ jump directly to your specific platform video
use Python 3.8.1 or higher for this course

Virtual Environments

3rd Party Libraries
→ many 3rd party libraries exist
    → add-ons to Python for more specialized functionality
    → a bunch of files
    → that get added to your Python "installation"
→ 3rd party libraries often rely on other 3rd party libraries
→ or might have specific releases for specific Python versions
→ this can create conflicts!
    my_app_1 → some_lib_1.0 (breaks with some_lib_1.1 and higher)
    my_app_2 → some_lib_1.1 (breaks with some_lib_1.0 and lower)

Solution
→ since Python is just a directory of files
→ create two copies of this directory
    /usr/user1/python3.9-ENV1/ (install some_lib_1.0 in here)
    /usr/user1/python3.9-ENV2/ (install some_lib_1.1 in here)
→ run your apps/shell using the appropriate Python directory
    → /usr/user1/python3.9-ENV1/bin/python my_app.py
    → /usr/user1/python3.9-ENV2/bin/python my_other_app.py
→ typing this long path is tedious
→ add to PATH environment variable (tells OS where to look for executables)
    to run my_app.py with ENV1:
        → add /usr/user1/python3.9-ENV1/bin/ to (front of) PATH
        → python my_app.py
    to run my_other_app.py with ENV2:
        → remove /usr/user1/python3.9-ENV1/bin/ from PATH
        → add /usr/user1/python3.9-ENV2/bin/ to (front of) PATH
        → python my_other_app.py
→ works but can become tedious as well!

Virtual Environments
→ these are used to perform the exact same steps
    → make copy of Python installation
    → provides scripts to "activate"/"deactivate" the environment
        → unsets old path / sets new path
→ very efficient on Unix/Mac – uses symlinks
→ little less efficient on Windows – actually copies the files
→ provides solution to version conflicts
→ use them!!

Creating Virtual Environments
→ different implementations of this have evolved over the years
→ Python now has a virtual environment manager built-in
    → we'll use that one in this course
→ first decide which Python version to use
    → Mac: python3.8, python3.9, etc
    → Windows: specify full path to python version
<python version/path> -m venv <your_env_name>
    → creates a new virtual environment

Activating the Virtual Environment
→ activating a virtual environment essentially modifies your PATH
→ when you type python on command line after activating environment
    → you will be running version of Python located in that environment directory
→ different for Mac/Linux vs Windows
    Mac/Linux:
        source <path_to_env>/bin/activate
    Windows:
        <path_to_env>\Scripts\activate.bat
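
A quick way to check which Python you are actually running after activation is to ask the interpreter itself. The sketch below is only an illustration (the variable name in_virtual_env is ours); it relies on sys.prefix differing from sys.base_prefix when the running interpreter belongs to an environment created with venv.

```python
import sys

# sys.prefix is the directory of the environment currently in use;
# sys.base_prefix is the Python installation it was created from.
# They differ only when running inside a venv-style virtual environment.
in_virtual_env = sys.prefix != sys.base_prefix

print("interpreter:", sys.executable)
print("running inside a virtual environment:", in_virtual_env)
```

Run it once before and once after activating an environment: the interpreter path should change to the one inside the environment directory.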

Coding

Installing Packages

pip: Package Installer for Python
→ installing 3rd party libraries (packages) basically requires copying files into the Python installation (whichever directory you want)
→ can be done manually, but again, tedious
→ instead can use another app that is installed alongside Python
    → pip
    → easily install, update and remove packages
    → uses the Python Package Index https://pypi.org
        → official 3rd party repository for Python
        → a repository of over 200,000 Python packages!

Installing Packages with pip
→ activate the virtual environment first (sets your PATH)
→ pip install package_name
→ can even specify versions
    pip install package_name==1.3.2
    pip install "package_name<=1.2"
    pip install "package_name>2.0"
    (quote specifiers that contain < or >, otherwise the shell treats them as redirection)
→ once you have decided on a specific version of a package for your project you should always use the same version number if creating a new environment for the same project
    → otherwise you could end up breaking your own code!

The requirements.txt File
→ use a file, alongside your code, to keep track of required packages and versions
    requirements.txt
        numpy==1.18.1
        pandas==1.1.4
        matplotlib==3.3.3
    → file name can be anything
    → requirements.txt is a standard convention
> pip install -r requirements.txt
→ easier (install everything with one command)
→ reproducibility / consistency
→ documentation
→ for this course a requirements.txt file is available in your downloads
    → defines (and pins) all specific libraries we'll need
    → please use it so we're all using the same library versions (functionality can change from version to version)
    → will install many libraries (and their dependencies)
        requests
        pytz
        numpy
        pandas
        and more…

Summary of Steps
→ create a new virtual environment
→ activate the environment
→ pip install libraries (aka packages)
→ don't forget to activate the environment before you pip install!
    → otherwise pip install will install these packages to your reference Python install

Coding

Running Python

Python Compiler/Interpreter
Python code → Python compiler → intermediate code (bytecode) → virtual machine (OS/platform specific) → operating system
→ Python is both a compiler and an interpreter
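
The two stages in the diagram can be observed from Python itself. This is a minimal sketch using the built-in compile and eval functions; the source string and variable names are just for illustration, and normally Python performs both steps for you automatically.

```python
source = "2 * 10 + 5"

# Stage 1: compile the source text into a code object (which holds bytecode).
code_object = compile(source, "<example>", "eval")

# Stage 2: the virtual machine executes that bytecode.
result = eval(code_object)

print(type(code_object).__name__)  # code
print(result)                      # 25
```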

How do we "run" Python?
→ Python is a compiler/interpreter
    → reads in a chunk of code (your program)
    → compiles and runs it
    → output is sent to your screen (console)
→ Python can do this in two ways
    → interactive mode
        → you type a Python line/block of code and execute it immediately
        → any output is immediately displayed
        → continue typing/running code one line/block at a time
        → REPL (read-eval-print-loop)
    → script mode
        → write all your code in files first
        → then execute all this code using command line

Interactive Mode
→ activate virtual environment
→ start Python shell (REPL) by typing python on command line
→ start typing Python commands
→ non graphical interface
    → perfect when working on GUI-less servers
    → little tedious to use when you are just trying things out
→ Jupyter Notebooks
    → browser based REPL (needs to be pip installed)
    → much easier/nicer to use than command line
    → can save your projects into a file (a notebook)
        → usually .ipynb extension

Script Mode
→ write all your code using a text editor
→ run your Python program using command line
    > python my_app.py
→ traditional programs are great for
    → running the same program over and over again
    → better structure
    → complex applications (web server, prediction server, libraries, …)

Python IDEs
IDE → integrated development environment
→ a text editor
→ with many extras for easily running code, debugging, and more
→ runs code using the same command line approach for scripts
→ many popular IDEs / editors around
    → PyCharm
    → VSCode
    this is the one I personally use for development
    → Atom
    → Sublime Text 3
    → Spyder and more…

Coding

3. Python Basics

→ some basic Python types
    → integers → floats → booleans
→ a brief explanation of what objects are
→ basic operators
    → arithmetic operators
    → integer division and modulus
    → comparison operators
    → Boolean operators
    → operator precedence

Basic Data Types

Types
→ Entities in a program always have an associated type
→ in the real world too!
    John is a person
    My local pharmacy is a store
    My bank balance is a (real) number
    The number of pages in a book is an (integer) number
    The file budget.xlsx is an Excel spreadsheet
Statements can be True or False
    → they have a type → Boolean type (True, False)
    "I am a Python dev" is True
    "My dog likes cats" is False

Integers
→ the int type
→ used to represent integral numbers: 0, 1, 100, -100, etc.
→ integers can be of any magnitude (as long as you have enough memory!)
→ integers have an exact representation in Python
→ integer numbers can be created from a literal in the Python code
    → 100
    → -100
    → 10_500_000 (note how you can use underscores for readability)
→ or, as the result of some calculation (expression)
    → 1 + 1
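
The points above are easy to try out. A small sketch (variable names are ours) showing an integer literal with underscores, an integer far larger than 64 bits, and an integer produced by an expression:

```python
big = 10_500_000     # underscores are only for readability
huge = 2 ** 100      # no overflow: ints grow as large as memory allows

print(big == 10500000)   # True
print(huge)              # 1267650600228229401496703205376
print(type(1 + 1))       # <class 'int'>
```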

Floats
→ the float type
→ used to represent real numbers (floating point): 3.14, -1.3
→ can use Python literals to define a float
    → 3.14
    → -1.3
    → 1_234.567_876
→ the decimal point differentiates a float from an int when using literals
    1 → int
    1.0 → float

Float Representations
Consider the decimal system: 1.234
In the decimal (base 10) system, this is representable (exactly) using powers of 10:
    $1.234 = 1 + \frac{2}{10} + \frac{3}{100} + \frac{4}{1000}$
But not all real numbers have a finite representation
    $\frac{1}{3}$ → as a fraction this is exact → but not using a decimal representation
    $\frac{1}{3} = 0.33\dot{3} = \frac{3}{10} + \frac{3}{100} + \frac{3}{1000} + \cdots$
    → infinite number of fractions

Integer Representations
→ computers only "know" two numbers
    → 0 and 1
    → binary system, aka base 2
→ any number in a computer is represented using powers of 2
the binary number 1011 can be converted to a decimal number:
    $1 \times 2^0 + 1 \times 2^1 + 0 \times 2^2 + 1 \times 2^3 = 1 + 2 + 0 + 8 = 11$
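
The conversion above can be reproduced in a few lines. This sketch computes the weighted sum by hand and then compares it with the built-in int and bin functions; the names digits and total are illustrative only.

```python
digits = "1011"

# Weight each binary digit by its power of 2, starting from the rightmost digit.
total = sum(int(d) * 2 ** power for power, d in enumerate(reversed(digits)))

print(total)            # 11
print(int(digits, 2))   # 11  (built-in conversion from a base-2 string)
print(bin(11))          # 0b1011
```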

Float Binary Representation
in the same way, floats are represented using powers of 2 and fractions of powers of 2
    $\frac{1}{2^1} + \frac{0}{2^2} + \frac{1}{2^3} = \frac{1}{2} + \frac{0}{4} + \frac{1}{8} = 0.5 + 0 + 0.125 = 0.625$
→ we saw that certain numbers do not have a finite decimal representation ($\frac{1}{3}$)
→ same happens with binary representations!
    $0.1 = \frac{0}{2} + \frac{0}{4} + \frac{0}{8} + \frac{1}{16} + \frac{1}{32} + \frac{0}{64} + \frac{0}{128} + \frac{1}{256} + \frac{1}{512} + \cdots$
    sum of the first five terms → 0.09375
    sum of the first nine terms → 0.099609375
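
To see how the binary expansion of 0.1 creeps toward 0.1 without ever reaching it, the partial sums can be computed exactly with the standard library's Fraction type. This is just a sketch: the list bits holds the first nine binary digits shown above, and partial_sum is a helper name of our own.

```python
from fractions import Fraction

# First nine binary digits of 0.1 after the binary point.
bits = [0, 0, 0, 1, 1, 0, 0, 1, 1]

def partial_sum(n):
    """Exact sum of the first n terms bit / 2**position."""
    return sum(Fraction(bit, 2 ** (i + 1)) for i, bit in enumerate(bits[:n]))

print(float(partial_sum(5)))   # 0.09375
print(float(partial_sum(9)))   # 0.099609375
```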

Floats are not always exact
→ bottom line: not all exact decimal numbers have an exact float representation
    → not a limitation of Python
    → all languages (incl. Excel) that use these binary fractions have that issue
⚠ be careful when comparing floats to one another
→ there is a data type that can handle exact representations of decimal fractions
    → Decimal (we will look at this type later)
    ⚠ calculations using Decimal are much slower than float
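
As a preview of the Decimal type mentioned above, here is a minimal sketch contrasting it with float. The numbers 1.1, 2.2 and 3.3 are arbitrary examples; note that Decimal values are built from strings so that no float rounding sneaks in.

```python
from decimal import Decimal

float_total = 1.1 + 2.2
decimal_total = Decimal("1.1") + Decimal("2.2")

print(float_total == 3.3)                 # False
print(decimal_total == Decimal("3.3"))    # True

# Decimal(0.1) shows the exact value actually stored for the float 0.1,
# which is not the same as the decimal number 0.1.
print(Decimal(0.1) == Decimal("0.1"))     # False
```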

Coding

Objects

What are objects?
→ entities created by Python
→ they have state (data)
→ they have methods (functionality)
    (state and methods together → encapsulation)
→ they often represent real world things
Car
    State (data)
        • brand → Toyota
        • model → Prius LE
        • model_year → 2020
        • # doors → 4
        • odometer → 5_402
    Functionality
        • accelerate()
        • brake()
        • set_cruise_control()
        • left_turn_signal_on()

Integers are Objects
→ an int is an object
→ it has state – the value of the integer
→ but it also has functionality
    → knows how to add itself to another integer
        (10).__add__(100) → 110
    → an integer object has the method __add__ used to implement addition
      (this is not how we typically add two integers together – but that's just syntax)
    → knows how to represent itself as a string (e.g. for visual output)

Floats are Objects
→ float numbers are objects too
    state → value
    functionality → __add__
other functionality too, for example:
    (0.125).as_integer_ratio() → (1, 8)
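
The method calls mentioned for int and float objects can be run directly. A short sketch (variable names are ours) calling those methods the "long way":

```python
int_sum = (10).__add__(100)           # same result as 10 + 100
float_sum = (1.5).__add__(2.0)        # same result as 1.5 + 2.0
ratio = (0.125).as_integer_ratio()    # 0.125 is exactly 1/8
text = (10).__str__()                 # how the int represents itself as a string

print(int_sum)     # 110
print(float_sum)   # 3.5
print(ratio)       # (1, 8)
print(text)        # 10
```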

Everything in Python is an object
→ any data type we have in Python is actually an object
    → it has state
    → it has functionality
    (together: attributes)
attributes encompass state and functionality
    some attributes are for state
    some attributes are for functionality

Dot Notation
If an object has attributes, how do we access those attributes?
→ dot notation
    car.brand → accesses the brand attribute of the car object
    car.model → accesses the model attribute of the car object
For attributes that represent functionality, we usually have to call the attribute to perform the action
→ often supplying additional parameters
    car.accelerate(10, "mph")
    (10).__add__(100)

Mutability and Immutability
→ an object is mutable if its internal state can be changed
    → one or more data attributes can be changed
→ an object is immutable if its internal state cannot be changed
    → the state of the object is "set in stone"
In Python many data types are immutable:
    • integers
    • floats
    • booleans
    • strings
    • …
While some are mutable:
    • lists
    • dictionaries
    • sets
    • …
    → we'll cover all these in this course
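
A small experiment makes the difference concrete. In this sketch (names are illustrative) a list is changed in place while remaining the same object, whereas trying to change a character of a string raises a TypeError. Lists, id() and exceptions are covered properly later in the course.

```python
numbers = [1, 2, 3]
identity_before = id(numbers)
numbers.append(4)                        # mutates the list in place
same_object = id(numbers) == identity_before

text = "abc"
try:
    text[0] = "x"                        # strings do not allow this
    string_is_mutable = True
except TypeError:
    string_is_mutable = False

print(numbers)             # [1, 2, 3, 4]
print(same_object)         # True
print(string_is_mutable)   # False
```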

Coding

Variables

Naming Objects
We often need to label objects with some name
→ reminds us what the object is used for
    apy
    account_balance
→ allows us to use the same object in multiple parts of our code

Assigning Names
To assign a label to an object we use the assignment operator =
    account_balance = 1000.0
    apy = 0.25
    (this is not the mathematical equality symbol)
→ we are assigning the label apy to the object 0.25
→ we say apy is a reference to the float object 0.25
→ the symbol apy is just a label currently pointing to (or referencing) the object 0.25

References and Variables
Another way of looking at this:
    apr = 0.25
    apr (a label) → references the object → float 0.25 (some object in memory)
→ apr is called a variable
→ but it is just a label (a symbol) that references some object in memory

Variables
So why the term variable?
→ over time, which object a symbol references can change
    a = 100
        → a is referencing the object 100
    later in the program…
    a = True
        → a is now referencing the object True
→ the state of the object the symbol references can change (mutate)
    a → list [1, 2, 3]
    we append an element to the list
    a → list [1, 2, 3, 4]
        → a is still referencing the same object, but the object's state has changed (mutated)

How Variable Assignment Happens
    apy = 0.25
    (LHS = RHS)
→ Python evaluates the RHS first
→ then it "assigns" that result to the symbol in the LHS
    (the LHS becomes a named reference to whatever results from the RHS)
Generally, RHS could be a more complex expression than just a literal
    balance = 1000.0 - 50.0
    circ = 2 * 3.14 * 1.5
    → in both cases, the RHS is fully evaluated first

Using Variables
Once a variable has been created, it can be used elsewhere in the program
    pi = 3.1415                 pi → 3.1415
    radius = 1                  radius → 1
    circ = 2 * pi * radius      circ → 6.283
→ circ is now a reference to the float 6.283
    radius = 2                  radius → 2
BUT this does not change circ
→ it still points to 6.283      circ → 6.283
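
The same sequence, runnable as written. The extra names circ_before and circ_after_radius_change are ours, added only so the two readings of circ can be compared; the last two lines show that circ changes only if the expression is evaluated again.

```python
pi = 3.1415
radius = 1
circ = 2 * pi * radius          # RHS is evaluated now, using radius = 1
circ_before = circ

radius = 2                      # re-pointing radius does not touch circ
circ_after_radius_change = circ

print(circ_before)                # 6.283
print(circ_after_radius_change)   # 6.283

circ = 2 * pi * radius          # re-evaluate to pick up the new radius
print(circ > circ_before)         # True
```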

Variable Naming
→ case sensitive
    apr is a different symbol than APR
→ must follow certain rules
→ should follow certain conventions

Must-Follow Rules
→ start with underscore (_) or letter (a-z A-Z)
    (unicode characters are actually OK, but stick to a-z A-Z)
→ followed by any number of underscores or letters, or digits (0-9)
    var  my_var  index1  index_1  _var  __var  __add__    (all legal)
→ cannot be reserved words
    True  False  if  def  and  or
    and many more we'll come across in this course
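
These rules can be checked programmatically. The sketch below combines str.isidentifier() with the standard library's keyword module; the helper is_legal_name and the list of candidate names are our own illustration, not part of the course code.

```python
import keyword

def is_legal_name(name):
    """True if name follows the identifier rules and is not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["var", "my_var", "index_1", "__add__", "1var", "my-var", "def", "True"]
for name in candidates:
    print(name, is_legal_name(name))
```

The first four names are reported as legal; 1var starts with a digit, my-var contains a hyphen, and def and True are reserved words.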

Should-Follow Conventions
PEP 8 Style Guide → typical conventions followed by most Python devs
    https://www.python.org/dev/peps/pep-0008/
terminology:
    camel case
        → separate words are distinguished by upper case letters
        accountBalance  BankAccount
    snake case
        → separate words are distinguished by underscores
        account_balance  bank_account
For standard variables:
    → snake case
    → all lower case letters
    account_balance ✅
    account_Balance ❌
We'll see other conventions for other special types of objects throughout this course
→ Good idea to follow standard conventions
→ but sometimes you may want to break those conventions
    → that's OK – just have a good reason, and be consistent
From the PEP 8 Style Guide:
    A foolish consistency is the hobgoblin of little minds. (Emerson)

Coding

Arithmetic Operators

Terminology
An operator is a programming language symbol that performs some operation on one or more values
Certain types of operators include:
    → arithmetic operators
    → comparison (or relational) operators
    → logical operators
The values an operator acts on are called operands
    → An operator that works on a single operand is called a unary operator
    → An operator that works on two operands is called a binary operator
    → An operator that works on three operands is called a ternary operator

Arithmetic Operators
Unary Operators
    -     Unary Minus                 -10
    +     Unary Plus                  +10
Binary Operators
    +     Addition                    10 + 20
    -     Subtraction                 20 - 10
    *     Multiplication              10 * 2
    /     Division                    10 / 2
    **    Power (exponentiation)      2 ** 4
→ use parentheses ( and ) to group expressions

Operand Types
Arithmetic operators can act on any numerical type
    int  float
    → as well as other types we'll encounter later
→ what the operator does is actually determined by the type of the operands
→ an operator may support mixed operand types
    2 + 2    → returns an int
    2 + 2.0  → returns a float
    5.5 * 2  → returns a float
    4 / 2    → also returns a float!
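
The result types listed above can be confirmed with the built-in type function. A minimal sketch; the dictionary results is just a convenient way to hold the four cases.

```python
results = {
    "2 + 2": 2 + 2,
    "2 + 2.0": 2 + 2.0,
    "5.5 * 2": 5.5 * 2,
    "4 / 2": 4 / 2,
}

for expression, value in results.items():
    print(expression, "→", value, type(value).__name__)
```

Note the last case: / always produces a float, even when both operands are ints and the division is exact.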

The Power Operator
The power operator works just like its mathematical counterpart
    2 ** 4 → 2 * 2 * 2 * 2 → 16 (int)
Recall from math: $2^{-n} = \frac{1}{2^n}$
    2 ** (-4) → 1 / (2 ** 4) → 1 / 16 → 0.0625 (float)
→ Python supports floats for either operand of the ** operator
    → just like mathematical exponentiation
    → graph of $f(x) = e^x$
    → $a^x := e^{x \ln(a)}$
→ Python also supports negative bases with real exponents
    → complex numbers
    → it's actually a numerical type in Python (complex)
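
A few quick checks of the ** operator, including a float exponent and a negative base with a fractional exponent. This is a sketch with names of our own; math.isclose is used for the float comparisons because the results are not exact.

```python
import math

print(2 ** 4)      # 16
print(2 ** -4)     # 0.0625

root = 2 ** 0.5                            # float exponent: square root of 2
via_exp = math.exp(0.5 * math.log(2))      # e ** (x * ln(a)) with a = 2, x = 0.5
print(math.isclose(root, via_exp))         # True

z = (-8) ** (1 / 3)                        # negative base, fractional exponent
print(type(z).__name__)                    # complex
```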

How Python Implements Arithmetic Operators
→ recall: numbers are actually objects
    → they have state
    → they also have functionality
    → one of these is the __add__ method (amongst many others)
when we do this: a + b, where a = 10 and b = 20
    → 10 is an int object that implements the __add__ method
Python actually does this to evaluate the expression:
    → a.__add__(b)
→ this works the same way with other types

Looking ahead…
→ any type can choose to implement __add__ however it wants
    → Python will then use that method to evaluate type_1 + type_2
→ we will see later how to create our own types
    → we can implement __add__ to define + for our custom type
→ we'll look at this in code, though some of the code may not make sense (yet!)
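
As a preview of what "implementing __add__" looks like, here is a minimal sketch. The Vector class is hypothetical, invented purely for this illustration; classes are covered in detail later, so do not worry if the syntax is unfamiliar.

```python
class Vector:
    """Hypothetical 2D vector, used only to illustrate __add__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Python calls this method when it evaluates vector_1 + vector_2.
        return Vector(self.x + other.x, self.y + other.y)


v = Vector(1, 2) + Vector(10, 20)
print(v.x, v.y)                   # 11 22

a, b = 10, 20
print(a + b == a.__add__(b))      # True
```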

Coding

Operator Precedence

When we write an expression such as this: 2 * 10 + 5
→ what does it mean?
    (2 * 10) + 5 → 25 ?    (Python chooses this)
    or
    2 * (10 + 5) → 30 ?
why?
→ operator precedence

Operators have precedence
→ an operator with higher precedence will bind more tightly
    → fancy way of saying it will be evaluated first
Precedence order with arithmetic operators (lower to higher)
    binary + -    (lower; equal precedence – since it does not actually matter)
    * /
    unary + -
    **            (higher; except for a unary operator to the right of **)
2 * 10 + 5
    → * has higher precedence than +
    → 2 * 10 is evaluated first
    → 20 + 5 → 25

** has highest precedence in our previous list
    2 * 2 ** 3 → 2 * (2 ** 3) → 2 * 8 → 16
    -2 ** 4 → -(2 ** 4) → -16
        (as opposed to (-2) ** 4 → 16)
→ except when unary operator is to the right of **
    2 ** -3 → 2 ** (-3) → 0.125
    → makes sense, difficult to interpret it otherwise anyway
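
Each of the precedence cases above can be verified by comparing the bare expression with its fully parenthesized form. A small sketch; the list name checks is ours.

```python
checks = [
    2 * 10 + 5 == (2 * 10) + 5 == 25,
    2 * 2 ** 3 == 2 * (2 ** 3) == 16,
    -2 ** 4 == -(2 ** 4) == -16,
    (-2) ** 4 == 16,
    2 ** -3 == 2 ** (-3) == 0.125,
]

print(all(checks))   # True
```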

A complete list of all operator precedence in Python can be found here:
    https://docs.python.org/3/reference/expressions.html#operator-precedence
→ my advice
    → relying on operator precedence is tricky
    → very easy to introduce bugs
    → use parentheses
    → it's just a few keystrokes more and will save a lot of pain later!
use (2 * 10) + 5 instead of 2 * 10 + 5

Coding

Integer Division and Mod

Let's review long division!
         43
    3 ) 131
        12
         11
          9
          2    (2 is the remainder)
    131 / 3 → $43\frac{2}{3}$    (43 is the integer portion of the division)
Python integer division: use //               131 // 3 → 43
Remainder: use Python mod operator %          131 % 3 → 2

The // Operator
a // b calculates the "integer portion" of a / b
    → easy to understand when a and b are positive
Reality: a // b is the floor of a / b
    floor(x) → the largest integer number <= x
    on the number line: -4 < -3.14 < -3 and 3 < 3.14 < 4
    floor(-3.14) → -4        floor(3.14) → 3
    12 / 5 → 2.4             12 // 5 → 2
    -12 / 5 → -2.4           -12 // 5 → -3
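
The claim that a // b is the floor of a / b can be checked against math.floor from the standard library. A short sketch using the two examples above; the list name rows is ours.

```python
import math

rows = []
for a, b in [(12, 5), (-12, 5)]:
    rows.append((a / b, a // b, math.floor(a / b)))

for true_division, floor_division, floored in rows:
    print(true_division, floor_division, floored)
# 2.4 2 2
# -2.4 -3 -3
```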

The mod Operator
Again negative numbers complicate things a bit!
→ I said you can use % to calculate the remainder of dividing a by b
    → in this case, for positive integers, a and b
    → a % b and the remainder of dividing a by b are the same
    → intuitive for positive numbers
But % is defined for negative integers and even floats as well
    → what does that even mean?
Let's go back to our first example: 131 / 3 → $43\frac{2}{3}$
    $\frac{131}{3} = \operatorname{floor}\left(\frac{131}{3}\right) + \frac{131 \bmod 3}{3}$
    a / b = a // b + (a % b) / b
    → a % b = b * (a / b - a // b)
    → a % b = a - b * (a // b)
So a % b is defined as the value that satisfies the above equation
    → and that's how a % b is well-defined for negative values
    → and even for floats!
→ this explains the "weird" (aka non-intuitive) behavior for negative numbers
    12 % 5 → 2       12 % -5 → -3
    -12 % 5 → 3      -12 % -5 → -2
→ as well as how it works for real numbers
    12.5 // 3 → 4.0
    12.5 % 3 → 0.5
    a % b = a - b * (a // b)
    → 12.5 % 3 = 12.5 - 3 * (4.0) = 12.5 - 12.0 = 0.5
→ moral: be careful using "intuition" for % and // and negative values
→ fortunately most of the problems we work with involve positive integers
    (more in coding video)
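
The defining equation a % b = a - b * (a // b) can be checked for every sign combination shown above, plus the float example. This is a sketch; pairs, results and identity_holds are names chosen for the illustration.

```python
pairs = [(12, 5), (12, -5), (-12, 5), (-12, -5), (12.5, 3)]

results = {}
identity_holds = True
for a, b in pairs:
    results[(a, b)] = a % b
    # The value of a % b must satisfy the defining equation.
    identity_holds = identity_holds and (a % b == a - b * (a // b))
    print(f"{a} // {b} = {a // b},  {a} % {b} = {a % b}")

print(identity_holds)   # True
```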

Coding

Comparison Operators

→ also known as relational operators
→ compares two things and yields a Boolean (bool) result
    == equality comparison
        → != for "not equal"
    <, <=, >, >=
        assumes the operands are comparable
        10.5 < 100 → makes sense
        "hello" > 100 → doesn't really make sense
→ == between operands that are not comparable usually returns False
→ <, <=, etc between non-comparable operands usually generates an Exception
    → TypeError (we'll come back to exceptions later)
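
Here is what those two behaviors look like for a string and an int. The try/except construct used to capture the error is covered later in the course; the variable names are ours.

```python
equal = "hello" == 100          # not comparable, so simply not equal
print(equal)                    # False

try:
    "hello" > 100               # ordering makes no sense here
    ordering_error = None
except TypeError as exc:
    ordering_error = exc

print(type(ordering_error).__name__)   # TypeError
```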

→ int and float types are comparable to each other
    10 <= 10.9 → True
Equality between integers is straightforward
    5 == 5 → True
    5 == 6 → False
floats are a different story!
    0.1 + 0.1 + 0.1 == 0.3 → False
→ in general: never use == to compare floats
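
One common alternative to == for floats is to test whether two values are close enough. The sketch below uses math.isclose from the standard library with its default tolerances; choosing a suitable tolerance depends on your problem, so treat this as an illustration rather than a rule.

```python
import math

total = 0.1 + 0.1 + 0.1

print(total == 0.3)               # False
print(math.isclose(total, 0.3))   # True
print(10 <= 10.9)                 # True (int and float compare fine)
```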

What does it mean for two objects to be equal?
→ everything in Python is an object
    1 is an int object
    1.0 is a float object
are 1 and 1.0 the same object? → No!
    → but they are the same value
→ need to differentiate what equality means
    → the object itself
    → the "value" (or state) of the object
y
Identity vs Value Equality of Objects e m
a d
To see if two objects are the same object à is
A c
te
à in most cases use == By
To see if two (compatible) objects are equal in value (in some sense) à ==

th
a
à we'll see situations where using is makes more sense

a = 1 M
a == b à True
b = 1.0
c = 1
t ©
a is b à False

g h c is c à True
d = 500
e = 500
y ri but…
d == e à True
à d and e are not
p
d is e à False

C o the same objects!

© MathByte Academy 100


y
Identity vs Value Equality of Objects e m
a d
the objects A c
The is operator is purely concerned with the memory address (identity) of

te
y
à is is called the identity comparison operator

B
h
The == operator, is, like +, actually implemented by the type itself
t
a
à recall: a + b actually executes a.__add__(b)
M
©
à == works the same way, using the __eq__ method
t
a == b à a.__eq__(b)
g h
r i
y
So we can de/ine what == means for custom types, by implementing __eq__

p
C o (we'll see this later in this course)

© MathByte Academy 101
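To make the difference between is and == concrete, here is a small sketch of a custom type that implements __eq__. The Point class is hypothetical and exists only for this illustration; custom classes are covered later in the course.

```python
class Point:
    """Hypothetical 2D point, used only to illustrate __eq__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Two points are equal in value when their coordinates match.
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y


p1 = Point(1, 2)
p2 = Point(1, 2)

print(p1 == p2)   # value equality, uses Point.__eq__
print(p1 is p2)   # identity: two distinct objects
print(p1 is p1)   # identity: the same object
```
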


Other Comparison Operators

→ other comparison operators we'll cover in this course

→ membership operators: in and not in
    → works with collection types
    → determines membership in some collection

    s = {1, 2, 3.14, True, 5.1}    (like a mathematical set)

    1 in s → True
    10 in s → False
    10 not in s → True

Coding

Boolean Operators

→ in Boolean algebra we only have two values: True and False
→ and three basic operators: and, or, not

→ Python syntax:

    not is a unary operator
        not True
        not (a < b)

    and, or are binary operators
        True or False
        True and False
        (enabled == True) and (withdrawal <= balance)

The not Operator

→ not simply reverses the Boolean value

Truth Table

    a        not a
    True     False
    False    True

The and Operator

→ a and b is True if and only if both a and b are True
→ False otherwise

    a        b        a and b
    True     True     True
    True     False    False
    False    True     False
    False    False    False

notice something interesting:
→ if a is False, then a and b is always False, no matter what b is

The or Operator

→ a or b is False if and only if both a and b are False
→ True otherwise

    a        b        a or b
    True     True     True
    True     False    True
    False    True     True
    False    False    False

notice something interesting:
→ if a is True, then a or b is always True, no matter what b is

Short-Circuited Evaluation

→ left and right operands are not restricted to values
→ can be expressions too

e.g. sin(a) > 0 and cos(a) < 0

given a value a:
    calculate sin(a) → evaluate sin(a) > 0 → result_1
    calculate cos(a) → evaluate cos(a) < 0 → result_2
    evaluate result_1 and result_2

→ 4 calculations plus the and operation

→ but what if result_1 had been False (i.e. sin(a) was not positive)?
→ recall: if a is False, then a and b is always False, no matter what b is
→ irrespective of what cos(a) < 0 evaluates to, the result will always be False
→ so if the left operand evaluates to False, we don't even need to calculate the right operand to get an answer
→ short-circuited evaluation

The same happens with a or b

    if a is True, then result is True, irrespective of what b is
    → Python returns True without evaluating b

And as we just saw with a and b

    if a is False, then result is False, irrespective of what b is
    → Python returns False without evaluating b

→ short-circuited evaluation
→ can be very useful
→ will see examples of this in section on conditional execution

Example of Short-Circuiting Usefulness

→ suppose we have some trading algorithm that can calculate some buy signal (True/False)
→ the catch is that the calculation is complex and resource intensive
→ in addition, we only want to place an order if the exchange is open

we could write some code to do this:

    if calc_signal(symbol) and exchange_open(symbol):
        buy(symbol)

→ problem: when exchange is closed we needlessly calculate the signal
→ but because of short-circuiting we can write:

    if exchange_open(symbol) and calc_signal(symbol):
        buy(symbol)

→ this way if exchange is closed, we don't even calculate the signal

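The effect of operand order can be observed by recording which functions actually get called. This is only a sketch: exchange_open, calc_signal and buy are stand-in stubs with made-up return values, not a real trading API.

```python
calls = []   # records the order in which the stub functions run


def exchange_open(symbol):
    calls.append('exchange_open')
    return False   # assumption for the demo: the exchange is closed


def calc_signal(symbol):
    calls.append('calc_signal')
    return True    # assumption for the demo: the expensive calculation says "buy"


def buy(symbol):
    calls.append('buy')


symbol = 'XYZ'

# exchange_open() returns False, so the right operand is never evaluated.
if exchange_open(symbol) and calc_signal(symbol):
    buy(symbol)

print(calls)
```

Only the cheap check runs; swapping the two operands would add 'calc_signal' to the recorded calls even though no order is placed.
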


Coding

Chapter 4: Conditional Execution

→ one of the fundamental constructs in programming is conditional execution

→ if something is true
    → run some code
→ else (optionally)
    → run some other piece of code

For example, for an ATM withdrawal:

→ if amount does not exceed available funds and does not exceed daily limit
    → dispense cash
    → print receipt
→ otherwise
    → deny request
    → display some text on screen
    → print slip containing reason

→ this is the primary reason we studied conditional expressions in the last chapter!

if… else…

The if Statement

    if <expression evaluates to True>:      # note the colon!
        code line 1
        code line 2

notice how this code block is indented
→ this tells Python that all these lines should be executed if the condition is True

→ you "exit" a code block by unindenting your code

→ Python uses code indentation to group together chunks of code
    → called code blocks

→ if you are familiar with other languages such as Java or C/C++, this is equivalent to using braces {}

Examples

    price = 200
    if price < 250:
        make_purchase()

→ the call to make_purchase() will only be executed if price < 250 evaluates to True, which in this case it is

    price = 300
    if price < 250:
        make_purchase()

→ in this case make_purchase() is not executed

Beware!

→ unindenting code from a block "exits" the block
→ the following is a common mistake

    price = 150
    if price < 100:
        print('price is below 100, buying…')    # this is the code block
    make_purchase()

→ price < 100 is False
→ does not run code in the if block
  (if block only contains a single statement: the print statement)
→ runs make_purchase() → bug!!

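The indentation mistake above can be reproduced side by side with the intended version. In this sketch make_purchase is a hypothetical stand-in that only records that it was called.

```python
purchases = []


def make_purchase():
    """Hypothetical stand-in: just records that a purchase happened."""
    purchases.append('purchase')


price = 150

# Buggy version: make_purchase() is NOT part of the if block, so it always runs.
if price < 100:
    print('price is below 100, buying...')
make_purchase()
buggy_purchases = len(purchases)

purchases.clear()

# Intended version: both statements are indented, so both belong to the if block.
if price < 100:
    print('price is below 100, buying...')
    make_purchase()
fixed_purchases = len(purchases)

print(buggy_purchases, fixed_purchases)
```
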
The else Clause

→ often in conditional execution

    if something is True
        → do something
    otherwise
        → do something else

→ Python's if statement supports an else clause → it is optional

    if <expression is True>:
        [Code Block 1]
    else:                   # note how else is unindented from the if block, and followed by a colon
        [Code Block 2]      # indent to form the else block

Example

    price = 200

    if price < 250:
        print('The price is right!')
    else:
        print('Too pricey!')
    print('Done.')          # notice this line of code is unindented

→ the last line has nothing to do with the if or else blocks
→ it will always execute

→ price < 250 is True
→ the if block is executed                 The price is right!
→ the else block is skipped
→ code resumes after the else block        Done.

Example

    price = 300

    if price < 250:
        print('The price is right!')
    else:
        print('Too pricey!')
    print('Done.')

→ price < 250 is False
→ the if block is skipped
→ the else block is executed               Too pricey!
→ code resumes after the else block        Done.

Nested if Statements

→ sometimes we need to nest conditional logic, either in the if block or in the else block

    if price < 1000:                # if price < 1000 is True
        if price < 500:             #     if price < 500
            volume = 50             #         set volume to 50
        else:                       #     otherwise
            volume = 10             #         set volume to 10
        make_purchase(volume)       #     purchase specified volume
    else:                           # otherwise
        print('Too pricey!')        #     Too pricey

→ the nesting can occur in the else block too

    if price < 1000:
        make_purchase()
    else:
        if price < 2000:
            contact_vendor()
        else:
            find_new_vendor()

→ can nest to any number of levels
→ too much nesting can make code hard to read!
    → keep it to a minimum

Coding

elif

Multi-Level if Statements

Consider this example to calculate a grade letter given a numeric grade:

    if grade >= 90:
        grade_letter = 'A'
    else:
        if grade >= 80:
            grade_letter = 'B'
        else:
            if grade >= 70:
                grade_letter = 'C'
            else:
                if grade >= 60:
                    grade_letter = 'D'
                else:
                    grade_letter = 'F'

→ that's a lot of nesting!
→ hard to read (for humans)

The elif Clause

Instead of this nested structure, Python provides an elif clause
→ equivalent to a nested else-if
→ does not require this double indentation
→ easier to read!

    if grade >= 90:
        grade_letter = 'A'
    elif grade >= 80:
        grade_letter = 'B'
    else:
        grade_letter = 'F'

once an if or elif clause executes (is True)
→ no other if, elif or else block executes

else executes if no if or elif statement executed

Grade Letter Example

The nested version shown under Multi-Level if Statements can be written with elif as:

    if grade >= 90:
        grade_letter = 'A'
    elif grade >= 80:
        grade_letter = 'B'
    elif grade >= 70:
        grade_letter = 'C'
    elif grade >= 60:
        grade_letter = 'D'
    else:
        grade_letter = 'F'

→ much more human readable!

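The elif chain can be wrapped in a function to try it on a few grades, including the boundary values. The function name and the sample grades are choices made for this sketch; functions themselves are covered later in the course.

```python
def grade_letter(grade):
    """Return the letter for a numeric grade using an if / elif / else chain."""
    if grade >= 90:
        return 'A'
    elif grade >= 80:
        return 'B'
    elif grade >= 70:
        return 'C'
    elif grade >= 60:
        return 'D'
    else:
        return 'F'


# Only the first clause whose condition is True executes.
for grade in (95, 80, 79.5, 60, 42):
    print(grade, grade_letter(grade))
```
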
Coding

Ternary Conditional Operator

Terminology

unary operator
→ an operator that takes a single operand
→ operator usually a prefix to the operand
→ -x

binary operator
→ an operator that takes two operands
→ usually operands are on either side of the operator
→ x + y

ternary operator
→ an operator that takes three operands
→ so how do we write that?

→ suppose we have an operator that takes three operands: a, b, c
→ the goal is for the operator to return a + (b * c)
→ this is a thing: it's called the Multiply-Accumulate operator (MAC)
  (but not available in Python!)

→ maybe this?

    a accmul b, c

→ or maybe this?

    a acc b mul c

all we've done here is split the name of the operator into two and added the operands in between

This type of conditional code is often used

    if <conditional exp>:
        var = value1
    else:
        var = value2

→ key is that each code block is a single assignment
    → to the same variable

→ Python provides a conditional ternary operator to do this

The conditional ternary operator

→ remember that an operator operates on operands and returns (calculates) some result

    if <conditional exp>:
        var = value1
    else:
        var = value2

→ in this case we want the ternary operator's operands to be:
    → the conditional expression
    → the value to return if the expression is True
    → the value to return if the expression is False

    value1 if <conditional exp> else value2

→ this is a single ternary operator
→ if condition is True, it returns value1
→ if condition is False, it returns value2

Example

    if price < 100:
        volume = 10
    else:
        volume = 1

→ can be re-written using a conditional ternary operator

    volume = 10 if price < 100 else 1

General Form

→ we saw examples where we used values as the return operands
→ but it's more general than that
→ the two value operands can be any expression
→ the result of the expression is then used

    <exp1> if <condition> else <exp2>

    var = (a - b) if a > b else (b - a)

Short-Circuiting

Just like we saw with Boolean operators, the ternary operator also uses short-circuit evaluation

    <exp1> if <condition> else <exp2>

→ first evaluates <condition>
→ if it is True, evaluates and returns <exp1>
    → but does not evaluate <exp2>
→ if it is False, evaluates and returns <exp2>
    → but does not evaluate <exp1>

Example

    result = a / b if b != 0 else 'NaN'

    a = 10
    b = 5

    → b is 5, so b != 0 evaluates to True
    → a / b is calculated and returned
    → returns 2.0

    a = 10
    b = 0

    → b is zero, so b != 0 evaluates to False
    → a / b is not calculated (thereby avoiding a division by zero exception)
    → the string 'NaN' is returned
    → this works just fine

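Both ternary examples can be run as small functions. The names safe_divide and absolute_difference are made up for this sketch; the expressions inside them are the ones from the slides.

```python
def safe_divide(a, b):
    # a / b is only evaluated when b != 0 is True, so no ZeroDivisionError occurs.
    return a / b if b != 0 else 'NaN'


def absolute_difference(a, b):
    # Only one of the two subtractions is evaluated.
    return (a - b) if a > b else (b - a)


print(safe_divide(10, 5))
print(safe_divide(10, 0))
print(absolute_difference(3, 8))
```

Note that safe_divide returns a float in one case and a string in the other, so callers have to handle both types.
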
Coding

Chapter 5: Sequence Types

What are Sequences?

→ sequences are ordered collections of objects
    → there is a first element
    → there is a second element
    → and a next one
    → sometimes called the sequential order

→ we can index those elements using integers
    → like counting them one by one
    → but in Python (and most other programming languages)
        → numbering starts at 0

→ like anything in Python, sequences are objects
    → they just happen to be container type objects that contain other objects

Indexing Sequences

    object, object, object, …, object      (n objects → length of sequence)
    0       1       2          n-1

→ n objects in sequence
→ last element index is n-1

in this course:
→ first element refers to the element at index 0
→ second element refers to the element at index 1
→ last element refers to the element at index n-1 (assuming n elements in sequence)

Sequence Length

→ sequences are usually finite
    → but not all sequence types are

→ in this course we'll stick to finite sequences
    → first element
    → last element
    → finite length

Homogeneous vs Heterogeneous Sequences

certain sequence types can only contain objects that are all the same type
→ homogeneous sequence types

other types of sequences may contain objects that are of different types
→ heterogeneous sequence types

Sequence Types in this Chapter

lists
→ mutable heterogeneous sequence type

tuples
→ immutable heterogeneous sequence type

strings
→ immutable homogeneous sequence type

Lists

The list Type

→ it is a container type
    → it contains elements
→ it is a sequence type
    → elements are ordered sequentially
    → elements are indexed
→ lists can be heterogeneous
→ lists are mutable
    → can add, replace or remove elements
→ lists have unbounded growth
    → can add as many elements as we want
    → but they are still finite
→ lists are objects
    → they have state → the elements contained in the list
    → they have functionality → add element, remove element, etc

list Literals

→ Python lists can be created using literals

    [10, 20, 30, 40]

note the enclosing square brackets []
→ this is what indicates the type is a list

→ this list is homogeneous
    → all elements are integers
→ but they don't have to be

    [10, 3.14, True]

→ they can even be nested

    [10, 20, 30, [True, False]]

    the last element of this list is itself a list

Accessing list Items by Index

→ lists are sequence types → sequential order
    → indexable

    l     = [10, 20, 30, 40, 50]      (length is 5)
    index     0   1   2   3   4

→ we can reference an element by its index: l[i]

    l[0] → 10
    l[1] → 20
    l[2] → 30

⚠ Trying to access a list by index greater than last index will cause an exception!

    l[5] → IndexError

Sequence Length

    l = [10, 20, 30, 40, 50]

→ visual inspection → length of l is 5
→ but we can use code to calculate this for us
    → the len function

    len(l) → 5
    len([True, False]) → 2

Empty Lists

sometimes we want to start with an empty list
and have code that adds to the list as our program runs

→ to create an empty list we can just use a literal

    l = []

then len(l) → 0

⚠ l[0] → IndexError

Replacing a list Element

    l = [10, 20, 30, 40, 50]

→ we can retrieve elements by index

    print(l[2]) → 30

→ but we can also replace an element at index i with a different element
    → we use the assignment operator =

    l[2] = True
    l → [10, 20, True, 40, 50]
    print(l[2]) → True

⚠ we are replacing elements, so the index must be valid!

    l[5] = 100 → IndexError

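Reading, replacing and the IndexError case can be tried together. In this sketch try_replace is a hypothetical helper, and it uses try/except, which the course only introduces later under exception handling.

```python
l = [10, 20, 30, 40, 50]

first = l[0]             # index 0 is the first element
last = l[len(l) - 1]     # index n-1 is the last element

l[2] = True              # replace the element at a valid index


def try_replace(lst, index, value):
    """Hypothetical helper: report whether replacing lst[index] succeeds."""
    try:
        lst[index] = value
        return 'ok'
    except IndexError:
        return 'IndexError'


outcome = try_replace(l, 5, 100)   # 5 is past the last valid index (4)

print(first, last, l, outcome)
```
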


Coding

Tuples

The tuple Type

→ very similar to the list type
    → it is a container type
    → it is a sequence type
    → tuples can be heterogeneous

→ BUT… they are an immutable container type
    → unlike lists, once a tuple has been created
        → cannot add or remove elements
        → cannot replace elements

tuple Literals

→ Python tuples can be created using literals

    (10, 20, 30, 40)

note the enclosing round brackets ()
→ this indicates the collection is a tuple

→ just like lists, they can contain any object, including another tuple

    (10, 20, (3, 4))
    (10, 20, (True, False), [100, 200])

→ often we don't even need the ()
→ Python interprets a comma separated list of elements as a tuple
→ so we can write (10, 20, 30)
→ or just 10, 20, 30

→ both these code snippets result in t being a tuple

    t = (10, 20, 30)
    t = 10, 20, 30

→ just like lists, tuples can contain any object
    → including other tuples or lists

    (1, [True, False], (3, 4))      (a tuple containing an int, a list and a tuple)

→ we can omit the parentheses on the outer tuple

    1, [True, False], (3, 4)

→ but not on the inner tuple (3, 4)

    1, [True, False], 3, 4      → not the same



Indexing, Length

→ just like lists, elements can be read back from a tuple using an index number
→ the len() function works with tuples also

    t = 10, 20, 30, 40, 50

    len(t) → 5
    t[0] → 10
    t[2] → 30
    t[5] → IndexError

tuples are Immutable

→ unlike lists, we cannot replace an element of a tuple

    t = 10, 20, 30
    t[0] = 100 → TypeError

→ the container is immutable
    → does not mean elements in the container are immutable

    t = 10, 20, [True, True]      (last element is a list → which is mutable)

    t[2] = 100 → TypeError

    t[2][1] = False
    t → (10, 20, [True, False])

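The difference between an immutable container and a mutable element inside it can be checked directly. This sketch again relies on try/except, which is covered later in the course; the variable names are chosen for illustration.

```python
t = 10, 20, [True, True]      # parentheses are optional for the outer tuple

try:
    t[2] = 100                # replacing an element of the tuple is not allowed
    replaced = True
except TypeError:
    replaced = False

t[2][1] = False               # mutating the list stored inside the tuple is allowed

nested = 1, [True, False], (3, 4)     # three elements
flat = 1, [True, False], 3, 4         # four elements: not the same tuple

print(replaced, t, len(nested), len(flat))
```
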
Creating Empty tuples

→ not very useful, so not used very often
→ use empty parentheses

    t = ()

→ that tuple is immutable, so it will remain empty for its lifetime

Coding

Strings

The str Type

→ this is also a container type
→ it is a sequence type
→ strings are homogeneous → they can only contain characters (unicode)
→ they are immutable

str Literals

→ Python strings can be created using literals

    'this is a string'

    note the enclosing quotes '…'

→ can also use double quotes

    "this is a string"

    note the enclosing double quotes "…"

→ these quotes/double-quotes are called the string delimiters

→ an empty string literal can be '' or ""

Indexing, Length

→ works the same way as any sequence type
→ use an index number to access elements of the string
→ use the len() function to find the length of the string

    s = 'Python'

    len(s) → 6
    s[0] → 'P'      (a string containing a single character)
    s[1] → 'y'
    s[5] → 'n'
    s[6] → IndexError

Coding

Slicing

→ slicing is a way to extract ranges of elements from a sequence
    → start position (by index number)
    → stop position (by index number)

    [start:stop]

→ start index is inclusive of the element
→ stop index is exclusive of the element
→ slices are the same type as the sequence being sliced

    l     = [10, 20, 30, 40]
    index     0   1   2   3

    l[0] → 10    l[1] → 20    l[2] → 30    l[3] → 40

    l[0:2] → starts at 0, and includes element at 0
           → ends at 2, but excludes element at 2

    l[0:2] → [10, 20]      → result is also a (new) list
    l[1:3] → [20, 30]

    t     = (10, 20, 30, 40)
    index     0   1   2   3

    t[0:2] → (10, 20)      → result is also a (new) tuple
    t[1:3] → (20, 30)

→ str type is a sequence type → slicing for strings works the same way

    s = 'Isaac Newton'      (indexes 0 1 2 3 4 5 6 7 8 9 10 11)

    s[0:4] → 'Isaa'
    s[0:5] → 'Isaac'
    s[6:9] → 'New'

Including Last Element in Slice

    s = 'Isaac Newton'      (indexes 0 1 2 3 4 5 6 7 8 9 10 11)

    s[6:11] → 'Newto'

→ how do we specify including the last element?
→ in a slice, it's ok to specify indexes outside the sequence bounds!
    → Python will automatically figure it out

    s[6:12] → 'Newton'
    s[6:1000] → 'Newton'

→ we can also leave the end index blank
    → Python will interpret as "up to and including the last element"

    s[6:] → 'Newton'

Including First Element in a Slice

    s = 'Isaac Newton'      (indexes 0 1 2 3 4 5 6 7 8 9 10 11)

→ just specify 0 as the start

    s[0:5] → 'Isaac'

→ can also leave the start index blank

    s[:5] → 'Isaac'

→ this is actually valid:

    s[:] → 'Isaac Newton'

    → this made a shallow copy of the sequence
    → we'll come back to that in a bit

Slicing with Steps

→ a step is a way to specify an interval when slicing a sequence

    s[start:stop:step]

    [2:10:2] → start at (and include) index 2
             → end at (but exclude) index 10
             → move in steps of 2

    2 3 4 5 6 7 8 9 → indexes: 2 4 6 8

    l = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]      (indexes 0 to 9)

    l[2:10:2] → [30, 50, 70, 90]

Negative Steps

→ possible to use negative step values
    → starts at index start (inclusive)
    → stops at index stop (exclusive)
    → moves backwards
    → so start should be greater than stop

    l = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]      (indexes 0 to 9)

    l[9:6:-1] → [100, 90, 80]
    l[:6:-1] → [100, 90, 80]
    l[3::-1] → [40, 30, 20, 10]
    l[::-1] → [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]

→ strings are sequence types → also works for strings

    s = 'Isaac Newton'      (indexes 0 1 2 3 4 5 6 7 8 9 10 11)

    s[11:5:-1] → 'notweN'
    s[:5:-1] → 'notweN'
    s[::-1] → 'notweN caasI'
    s[10::-2] → 'owNcaI'
    s[::-2] → 'nte as'

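The slicing rules can be verified with the same list and string used on the slides. The variable names on the left are chosen only for this sketch.

```python
l = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
s = 'Isaac Newton'

every_other = l[2:10:2]     # start at 2, stop before 10, step 2
reversed_list = l[::-1]     # blank start/stop with a negative step reverses
tail = l[:6:-1]             # start defaults to the last element when step < 0

surname = s[6:1000]         # a stop index past the end is clamped in a slice
backwards = s[::-1]
skipping_back = s[10::-2]

print(every_other, reversed_list, tail)
print(surname, backwards, skipping_back)
```

A slice always has the same type as the sequence it came from, so every_other is a list and surname is a str.
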
Coding

Manipulating Sequences

→ mutable sequences can be modified
    → replace elements
    → delete elements
    → add elements
        → often appended (to the end)
        → can also specify where in the sequence to insert

Replacing Single Elements

Replace an element at index i by assigning a new element to that index

    l = [10, 20, 30]
    l[1] = 'hello'
    l → [10, 'hello', 30]

Replacing an entire Slice

→ can also replace an entire slice
    → just assign a new collection to the slice
    → slice will be replaced with elements in RHS

    my_list = [1, 2, 3, 4, 5]

any one of the following, applied to the original list, gives my_list → ['a', 'b', 4, 5]

    my_list[0:3] = ['a', 'b']
    my_list[0:3] = ('a', 'b')
    my_list[0:3] = 'ab'

→ Python uses the elements of the sequence in RHS when assigning to a slice
  (but not when assigning using a single index)

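The parenthetical remark about single-index assignment is easy to miss, so here is a short sketch contrasting the two forms. The list names are chosen for illustration.

```python
sliced = [1, 2, 3, 4, 5]
sliced[0:3] = 'ab'            # slice assignment: the string's elements are used

indexed = [1, 2, 3, 4, 5]
indexed[0] = 'ab'             # index assignment: the string itself becomes one element

nested = [1, 2, 3, 4, 5]
nested[0] = ['a', 'b']        # index assignment: the list becomes one (nested) element

print(sliced)
print(indexed)
print(nested)
```
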
Deleting Elements

→ can delete an element by index

    my_list = [1, 2, 3, 4, 5]
    del my_list[1]
    my_list → [1, 3, 4, 5]

→ can delete an entire slice

    my_list = [1, 2, 3, 4, 5]
    del my_list[1:3]
    my_list → [1, 4, 5]

Appending Elements

→ we can append one element

    my_list = [1, 2, 3]
    my_list.append(4)
    my_list → [1, 2, 3, 4]

→ to append multiple elements, we extend the sequence

    my_list = [1, 2, 3]

    my_list.extend(['a', 'b', 'c'])
    my_list.extend(('a', 'b', 'c'))      does the same thing
    my_list.extend('abc')                does the same thing

    my_list → [1, 2, 3, 'a', 'b', 'c']

Inserting an Element

→ instead of appending, we can insert at some index
→ use sparingly: this is much slower than appending or extending

    my_list = [2, 3, 4, 5]
    my_list.insert(0, 100)
    my_list → [100, 2, 3, 4, 5]

    my_list = [2, 3, 4, 5]
    my_list.insert(2, 100)
    my_list → [2, 3, 100, 4, 5]

→ element is inserted so its position is the index; remaining elements are shifted right

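The list operations above can be chained on a single list to see how each one changes it. The comments show the state after each step, and the contrast between append and extend at the end is an addition made for this sketch.

```python
my_list = [1, 2, 3]

my_list.append(4)          # [1, 2, 3, 4]
my_list.extend('ab')       # [1, 2, 3, 4, 'a', 'b']
my_list.insert(0, 100)     # [100, 1, 2, 3, 4, 'a', 'b']
del my_list[1:3]           # [100, 3, 4, 'a', 'b']

# append adds its argument as ONE element, even when it is itself a sequence.
appended = [1, 2, 3]
appended.append([4, 5])    # [1, 2, 3, [4, 5]]

# extend adds each element of its argument.
extended = [1, 2, 3]
extended.extend([4, 5])    # [1, 2, 3, 4, 5]

print(my_list, appended, extended)
```
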
Coding

Copying Sequences

Shallow vs Deep Copies

→ two types of copies

→ shallow copies
    → new sequence is created (not same sequence object as original)
    → elements in new sequence reference the same elements as original

→ deep copies
    → new sequence is created (not same sequence object as original)
    → each element in new sequence is a deep copy of the original
        → totally new and independent objects

Shallow Copy

[Diagram: original and shallow_copy each have index 0, index 1 and index 2, which reference the same obj 1, obj 2 and obj 3]

original is shallow_copy → False

→ original and shallow_copy are not the same containers
→ but the elements are referencing the same objects

→ add/remove/replace element in one does not affect the other

[Diagram: shallow_copy gains an index 3 referencing a new obj 4; original still has only index 0, index 1 and index 2]

→ but mutating an element will affect both (since it is a shared reference)

[Diagram: one of the shared objects is marked "(modified)" and is still referenced by both original and shallow_copy]

Creating Shallow Copies

→ use slicing to slice the entire sequence → my_list[:]
→ use the copy method → my_list.copy()

    my_list = [1, 2, 3]

    my_copy = my_list[:]
    my_copy = my_list.copy()      my_copy → [1, 2, 3]

    del my_copy[0]                my_copy → [2, 3]
                                  my_list → [1, 2, 3]

Mutable Elements

    my_list = [['a', 'b'], 2, 3]
    my_copy = my_list.copy()

→ my_list[0] and my_copy[0] are both referencing the same list ['a', 'b']

    my_list[0] is my_copy[0] → True

so if we modify that element (from either sequence):

    my_copy[0].append('c')

    my_copy[0] → ['a', 'b', 'c']
    my_list[0] → ['a', 'b', 'c']

Creating Deep Copies

→ use the deepcopy function in the copy module

    from copy import deepcopy

    my_list = [['a', 'b'], 2, 3]
    my_copy = deepcopy(my_list)

    my_list[0] is my_copy[0] → False      → element has been copied too!

    my_copy[0].append('c')

    my_copy[0] → ['a', 'b', 'c']
    my_list[0] → ['a', 'b']

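Putting the shallow and deep copies next to each other makes the shared-reference behaviour visible. The names shallow and deep are chosen for this sketch.

```python
from copy import deepcopy

my_list = [['a', 'b'], 2, 3]

shallow = my_list.copy()      # new container, same element objects
deep = deepcopy(my_list)      # new container, independent copies of the elements

shallow[0].append('c')        # mutates the inner list shared with my_list
shallow.append(4)             # changes only the shallow copy's container

print(my_list)
print(shallow)
print(deep)
```

Only the mutation of the shared inner list is visible through my_list; the deep copy is unaffected by either change.
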
Coding

Unpacking Sequences

consider a sequence

    data = (1, 2, 3)
    → this is a tuple with three elements

we want to assign those values 1, 2 and 3 to some symbols a, b and c respectively

→ could do it this way:

    a = data[0]
    b = data[1]
    c = data[2]

but Python has a better way of doing this! → unpacking

    a, b, c = (1, 2, 3)

Since tuples don't actually need the parentheses in this case, we can write:

    a, b, c = 1, 2, 3

→ this works with any sequence in general

    a, b = [10, 20]
    a → 10
    b → 20

    a, b, c = 'XYZ'
    a → 'X'
    b → 'Y'
    c → 'Z'

→ beware!
→ number of elements in sequence on RHS must match number of symbols on LHS

    a, b = 1, 2, 3
    → ValueError (too many values to unpack)

    a, b, c = 1, 2
    → ValueError (not enough values to unpack)

Swapping Two Variable Values

→ this is a common problem

given two variables a and b, swap the value of a and b

    Initial State:    a → 10    b → 20
    End State:        a → 20    b → 10

→ typical solution uses a temporary variable

    temp = a
    a = b
    b = temp

→ 3 lines of code and an unnecessary variable

→ can use unpacking to our advantage
→ remember: in an assignment, the RHS expression is evaluated completely first
    → then the assignment takes place

    a, b = b, a

→ RHS is evaluated first            b, a is the tuple 20, 10
→ then the assignment is made       a, b = 20, 10
→ values of a and b have been swapped!

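Unpacking, the swap idiom and the length-mismatch error can all be tried in a few lines. The variable names are chosen for illustration, and the try/except is only there to capture the error message.

```python
a, b, c = 'XYZ'          # any sequence can be unpacked, including strings

x, y = 10, 20
x, y = y, x              # RHS tuple (20, 10) is built first, then unpacked

try:
    p, q = 1, 2, 3       # three values, two symbols
    error_message = ''
except ValueError as exc:
    error_message = str(exc)

print(a, b, c)
print(x, y)
print(error_message)
```
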
Coding

Chapter 6: Strings

→ strings are sequence types
→ but they are more specialized than generic sequences
    → they are homogeneous
    → each element is a single character
    → we have additional functionality available

Unicode

In the beginning…

… there was ASCII (American Standard Code for Information Interchange)

→ addressed the problem of a standard for assigning
    → numeric codes
    → to characters
        → printable and non-printable
→ and encoding the value into binary → using sequences of 7 bits

→ given a data stream filled with 0's and 1's
    → carve up in 7 bits and decode character
→ fonts handle displaying the character
    → a bunch of pixels
    → a glyph

→ supported character set was limited
    → 128 characters
        → 95 printable characters (a-z, A-Z, 0-9, * / etc)
        → 33 non-printable characters (control codes, e.g. esc, newline, tab, etc)

→ attempts were made to extend the ASCII set
    → still far too limited
    → standard was poorly followed

→ Unicode was developed
    → focused on assigning a code to a character (code point)
    → does not specify how to encode the code points into a binary format
    → other standards for doing that appeared
        → UTF-8    ← very popular, default in Python
        → UTF-16
        → UTF-32
        (UTF → Unicode Transformation Format)
    → > 100,000 code points defined so far

Code Points

→ backward compatible with ASCII

    ASCII character code for A → 65 (decimal), 41 (hexadecimal)
    Unicode code point for A   → 65 (decimal), 41 (hexadecimal)

    decimal → base 10 (0-9)
    hexadecimal → base 16 (0-9, A-F)

What is hex anyway?

Decimal system – uses powers of 10
→ 10 digits, 0-9

    10^3  10^2  10^1  10^0
      9     0     3     4

    $9034 = 4 \times 10^0 + 3 \times 10^1 + 0 \times 10^2 + 9 \times 10^3$

Binary – uses powers of 2
→ 2 digits, 0-1

    2^3  2^2  2^1  2^0
     1    0    1    1

    $(1011)_2 = 1 \times 2^0 + 1 \times 2^1 + 0 \times 2^2 + 1 \times 2^3 = 11_{10}$

Hexadecimal – uses powers of 16
→ 16 digits, 0-9, A-F
    A → 10, B → 11, …, F → 15

    16^3  16^2  16^1  16^0
      F     C     1     5

    $(FC15)_{16} = 5 \times 16^0 + 1 \times 16^1 + 12 \times 16^2 + 15 \times 16^3 = 64533_{10}$

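The positional expansions above can be checked in Python. This is a small sketch, not part of the lecture: it uses the built-in int(text, base) to parse digits in a given base, and compares the result with the sums written out by hand. The variable names are illustrative.

```python
# int(text, base) interprets a string of digits in the given base.
from_hex = int('FC15', 16)
from_binary = int('1011', 2)

# The same values computed by hand from powers of the base.
hex_by_hand = 15 * 16**3 + 12 * 16**2 + 1 * 16**1 + 5 * 16**0
binary_by_hand = 1 * 2**3 + 0 * 2**2 + 1 * 2**1 + 1 * 2**0

print(from_hex, hex_by_hand)        # both 64533
print(from_binary, binary_by_hand)  # both 11

# hex() and bin() go the other way, from an integer to a string.
print(hex(64533), bin(11))
```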


Unicode Character A

https://www.compart.com/en/unicode/U+0041

[screenshot of the compart.com page for U+0041, highlighting the character (hex) code, the character name, and the corresponding lowercase letter]

→ ord() function
    → returns code point for a single character (in decimal)

    ord('A') → 65

→ hex()
    → converts decimal to hex string

    hex(65) → '0x41'    (0x prefix indicates the number after that is in hex)

https://www.compart.com/en/unicode/U+03B1

    hex(ord("α")) → '0x3b1'

→ copy/paste the glyph from that page straight into your Python code

Other ways to specify the character in a string

→ use escape codes
    → by name
    → by hex code

    "\N{Greek Small Letter Alpha}" → "α"

    "The letter \N{Greek Small Letter Alpha} is the first letter of the Greek alphabet."
    → 'The letter α is the first letter of the Greek alphabet.'

    "\u03b1"    "\u03B1"        \u must be followed by exactly 4 hex digits (0-F)

    "The letter \u03B1 is the first letter of the Greek alphabet."
    → 'The letter α is the first letter of the Greek alphabet.'

https://www.compart.com/en/unicode/U+1F40D

    "\N{snake}" → '🐍'

→ note how character code 1F40D has 5 digits
→ must use \U followed by exactly 8 hex digits (\u is limited to 4 digits)

    "\U0001F40D" → '🐍'

→ pad with zeroes to make 8 digits

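The code point functions and the escape forms can be compared side by side. In this sketch chr() is an addition not mentioned in the slides (it is the built-in inverse of ord()), and the variable names are illustrative.

```python
alpha = 'α'
code_point = ord(alpha)        # code point in decimal
hex_code = hex(code_point)     # same code point as a hex string
round_trip = chr(code_point)   # chr() is the inverse of ord()

# Three ways to write the same character in a string literal.
by_name = '\N{Greek Small Letter Alpha}'
by_short_escape = '\u03b1'

# Code points above FFFF need the 8-digit \U form (or the name).
snake_by_name = '\N{snake}'
snake_by_long_escape = '\U0001F40D'

print(code_point, hex_code, round_trip)
print(by_name == by_short_escape == alpha)
print(snake_by_name == snake_by_long_escape, hex(ord(snake_by_name)))
```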


Coding

Common String Methods

→ Python has a ton of string methods

    https://docs.python.org/3/library/stdtypes.html#string-methods

In this video we are going to look at some in these categories
    → case conversions
    → stripping start and end characters
    → concatenating strings
    → splitting and joining strings
    → finding substrings

→ methods, called using dot notation    my_string.method()

→ remember that strings are immutable
    → operations never modify a string → just return a new string



Case Mappings

    lower()     'Hello World'.lower() → 'hello world'
    upper()     'python'.upper() → 'PYTHON'
    title()     'one two three'.title() → 'One Two Three'

→ returns a new string
→ primarily used for visual display
→ BEWARE: may not work for caseless comparisons

Case Folding

    casefold()

→ used for caseless comparisons

    s1 = 'hello'
    s2 = 'HeLlo'

    s1.casefold() == s2.casefold() → True

→ we'll explore this vs using case mappings in the code section

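One case where lower() and casefold() disagree is the German letter ß. The example strings below are my own choice, not taken from the lecture; they only illustrate why case mappings may fail for caseless comparisons.

```python
s1 = 'straße'
s2 = 'STRASSE'

# lower() leaves 'ß' as it is, so the two strings still differ.
same_with_lower = s1.lower() == s2.lower()

# casefold() maps 'ß' to 'ss', so the comparison succeeds.
same_with_casefold = s1.casefold() == s2.casefold()

print(same_with_lower, same_with_casefold)
```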
Stripping

sometimes we want to remove leading and trailing characters
    → trailing commas
    → whitespace around a string

    .lstrip()   → strips all whitespace on left of string
    .rstrip()   → strips all whitespace on right of string
    .strip()    → strips all whitespace on both ends of string

→ can specify what characters to strip

    .strip(' ')       → strip space characters from both ends
    .lstrip('abc')    → strip the characters 'a', 'b', 'c' from left end

→ returns a new string

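A short sketch of the strip family on made-up input strings. Note that the argument is treated as a set of characters to remove, not as a prefix or suffix.

```python
raw = '   hello, world,,  '

no_whitespace = raw.strip()              # whitespace removed from both ends
no_commas = no_whitespace.rstrip(',')    # trailing commas removed

# 'a', 'b' and 'c' are stripped in any order until another character is hit.
left_only = 'abcabcxyzabc'.lstrip('abc')

print(repr(no_whitespace), repr(no_commas), repr(left_only))
print(repr(raw))   # the original string is unchanged
```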
Concatenation

combining two or more strings to form a single string is called concatenation

    'Hello' + ' ' + 'World!' → 'Hello World!'

→ again, this creates a new string

Splitting Strings

→ useful for parsing data from a text file

    data = '100, 200, 300, 400'    ← a string containing comma delimited values

→ can easily split this on the comma

    data.split(',')

→ returns a list of strings

    ['100', ' 200', ' 300', ' 400']

→ note the spaces → we can strip them later

Joining Strings

→ this is the opposite of splitting strings

suppose we want to join these strings with `, ` characters between each:

    'a'   'b'   'c'   'd'

we could write:

    'a' + ', ' + 'b' + ', ' + 'c' + ', ' + 'd'

→ tedious to type out
→ "hardcoded"
→ what if we have a sequence of strings we want to concatenate?
    → this approach is not general enough

    data = ('ab', 'cd', 'ef')       → data is a sequence of strings

    '--'.join(data)                 → join the strings in data with -- in between
                                    → 'ab--cd--ef'

    ','.join(['10', '20', '30'])    → '10,20,30'

→ remember that a string is a sequence of single characters

    '='.join('python')              → 'p=y=t=h=o=n'

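Splitting, stripping and joining are often used together. The sketch below reuses the comma-delimited string from the splitting slide; the list name cleaned and the explicit loop are my own choices for illustration.

```python
data = '100, 200, 300, 400'

# Split on the comma; the pieces keep their leading spaces.
pieces = data.split(',')

# Strip each piece to remove the leftover whitespace.
cleaned = []
for piece in pieces:
    cleaned.append(piece.strip())

# join is the opposite of split: rebuild the original string.
rejoined = ', '.join(cleaned)

print(pieces)
print(cleaned)
print(rejoined)
```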
Finding Substrings

→ often just want to know if a sequence of characters is contained inside another
→ use the in operator

    'x' in 'xyz'         → True
    'a' in 'xyz'         → False
    'pyt' in 'python'    → True
    'pyt' in 'Python'    → False

→ tests containment
→ but gives no indication of where the substring is

→ slight variation
    → does the string start (or end) with the specified characters
    → still a containment test

    .startswith('…')
    .endswith('…')

    'python'.startswith('py')     → True
    'python'.startswith('hon')    → False
    'python'.endswith('py')       → False
    'python'.endswith('hon')      → True

Finding the Index of a Substring

→ used when we need to know the index of the start position of a substring

    data = 'This is a grammatically correct sentence.'

→ at what index does the string 'correct' occur?

    data.index('correct') → 24

→ what if substring is not found?
    → Python raises a ValueError
    → potentially useful once we learn how to handle exceptions

→ what if we don't want an exception?
    → find
    → returns -1 if substring is not found

    data.find('correct') → 24
    data.find('DOW')     → -1

→ once we know how to handle exceptions, this method is a bit redundant
→ personally, I prefer using index and using exception handling

Important Note

→ only interested in whether or not a substring is contained in another string
    → use in
→ only use index or find when you need to know the index
→ in is much faster!

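The three lookups can be compared on the sentence used in the slides. The try/except/else block is a preview of exception handling, which the course covers later, and the flag name index_raised is hypothetical.

```python
data = 'This is a grammatically correct sentence.'

contains = 'correct' in data        # containment only, no position
position = data.index('correct')    # start index of the substring
missing = data.find('DOW')          # -1 instead of an exception

# index() raises ValueError when the substring is absent.
try:
    data.index('DOW')
except ValueError:
    index_raised = True
else:
    index_raised = False

print(contains, position, missing, index_raised)
```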
Coding

String Interpolation

→ often we want to build strings that contain values from some variable
→ can use concatenation
    → + works with two strings
    → cannot mix string and numeric for example

    'test' + 100      → TypeError
    'test' + str(100) → 'test100'

Example

Suppose we have four variables:

    open_ = 98
    high = 100
    low = 95
    close = 99

We want to build a string that looks like this for display purposes:

    'open: 98, high: 100, low: 95, close: 99'

→ using concatenation:

    'open: ' + str(open_) + ', high: ' + str(high) + ', low: ' + str(low) + ', close:' + str(close)

→ tedious and error prone! → in fact there is an error, can you spot it?!



String Interpolation

→ multiple variants → two most common techniques

→ the format method
    → use {} as placeholders in our string
    → pass variables to format method in same order as we want them in the string
    → number of {} in string and arguments in format should match
        → format can have more arguments, they'll just be ignored
        → IndexError exception if not enough arguments

    'open: {}, high: {}, low: {}, close: {}'.format(open_, high, low, close)

→ note how we did not have to convert the values to strings!

f Strings

→ new in Python 3.6
→ prefix the string with f
→ use {expr} directly inside the string
→ Python evaluates expr and interpolates the result directly inside the string

    f'1 + 1 = {1 + 1}' → '1 + 1 = 2'

    value = 3.14
    f'pi is approximately {value}' → 'pi is approximately 3.14'

    f'open: {open_}, high: {high}, low: {low}, close: {close}'
    → 'open: 98, high: 100, low: 95, close: 99'

→ could do this as well

    open_ = 98
    high = 100
    low = 95
    close = 99

    f'open: {open_}, close: {close}, delta: {close - open_}'
    → 'open: 98, close: 99, delta: 1'

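Both interpolation techniques can be run against the same four variables. This sketch only restates the slide examples in runnable form; the result names with_format, with_fstring and with_delta are illustrative.

```python
open_ = 98
high = 100
low = 95
close = 99

# format(): placeholders are filled in positional order.
with_format = 'open: {}, high: {}, low: {}, close: {}'.format(open_, high, low, close)

# f-string: expressions are evaluated in place.
with_fstring = f'open: {open_}, high: {high}, low: {low}, close: {close}'
with_delta = f'open: {open_}, close: {close}, delta: {close - open_}'

print(with_format)
print(with_fstring)
print(with_delta)
```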
Coding

7. Iteration

→ fundamental aspect of writing programs is repetition
→ want to repeat the same process (code) multiple times
→ how many times?
    → known in advance
        → load file with 10,000 rows
        → process each row
        → repeat the same process 10,000 times
        → deterministic
    → not known in advance
        → get commodity tick data
        → analyze data until ask price falls below some level
        → then do something else and stop processing
        → process may repeat 10 times, or 100 times, we don't know in advance
        → non-deterministic

→ this repetition is called iteration

deterministic iteration
    → we iterate over the elements of some container
        → e.g. sequences
        → more generally over objects that are iterable
        → not all iterables are sequences
            → a bag of marbles is iterable, but it is not a sequence!
    → for loop

non-deterministic iteration
    → we iterate while some condition is True
    → while loop

The range Function

The range Object

→ range object is an iterable object
→ it serves up integers one by one as they are requested
→ but the full list of integers does not exist all at once in memory
    → memory efficient
→ it has a finite number of integers
→ we can iterate over that range object
→ since it exists and has a finite number of integers → deterministic iteration

→ we can use the range() function to create range objects

The range() Function

→ three flavors depending on how many arguments are specified

    range(end)                 (one argument)
    → generates integers from 0 (inclusive) to end (exclusive)

    range(start, end)          (two arguments)
    → generates integers from start (inclusive) to end (exclusive)

    range(start, end, step)    (three arguments)
    → generates integers from start (inclusive) to end (exclusive)
    → in steps of step

→ should remind you of slicing

Viewing Contents of range Object

    r = range(5)
    print(r)           → range(0, 5)
    → not what we wanted

→ can convert range object to a list or tuple

    print(tuple(r))    → (0, 1, 2, 3, 4)
    print(list(r))     → [0, 1, 2, 3, 4]

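The three flavors of range() can be inspected by converting each range object to a list. The specific arguments are my own examples, and the negative step in the last line goes slightly beyond what the slides show.

```python
one_arg = list(range(5))             # 0 up to, but not including, 5
two_args = list(range(2, 6))         # 2 up to, but not including, 6
three_args = list(range(1, 10, 3))   # 1, then every 3rd integer below 10
countdown = list(range(5, 0, -1))    # a negative step counts down

print(one_arg)
print(two_args)
print(three_args)
print(countdown)
print(repr(range(5)))                # the range object itself, not its contents
```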
Iteration

→ range object is iterable
→ we can use a for loop to iterate over the elements of this iterable (next lecture)

Coding

for Loops

→ for loops are used to iterate over elements of any iterable
→ the loop mechanism retrieves elements from the iterable one at a time
→ the body of the for loop is executed for each element retrieved
→ the loop terminates when all elements have been iterated

    for x in ['a', 'b']:
        y = x + x           note how the body is indented
        print(y)            → just like if…else… code blocks
    print('done')           unindented → not in loop body

1st iteration:
    'a' is retrieved and assigned to the symbol x
    y is the concatenation of x and x → 'aa'
    'aa' is printed to the console

2nd iteration:
    'b' is retrieved and assigned to the symbol x
    y is the concatenation of x and x → 'bb'
    'bb' is printed to the console

3rd iteration:
    → no more elements → loop terminates
    → code after loop executes → 'done'

Iterating over range Objects

→ range objects are iterable

    for i in range(4):      range(4) → 0, 1, 2, 3
        sq = i * i
        print(sq)

    output:
        0
        1
        4
        9

Loop Bodies (Blocks)

→ block can contain any valid Python code
    → if…else…
    → another loop (nested loop)

    for i in range(1, 4):
        for j in range(1, i+1):
            print(i, j, i*j)
        print('')

    output:
        i = 1    j in range(1, 1+1)
            1 1 1

        i = 2    j in range(1, 2+1)
            2 1 2
            2 2 4

        i = 3    j in range(1, 3+1)
            3 1 3
            3 2 6
            3 3 9

    data = [10, 20, 30, -10, 40, -5]

suppose we want to replace any negative value with 0

→ we can iterate over the data and test for negative numbers:

    for number in data:
        if number < 0:
            number = 0
        print(number)

    output:
        10
        20
        30
        0
        40
        0

→ but how do we replace -10 and -5 with 0 in data itself?
    (assigning to number does not modify the list)

→ easy if we know the index number

    data[3] = 0
    data[5] = 0

→ but we don't know that!!

The enumerate Function

enumerate is a function that
    → takes an iterable argument
    → returns a new iterable whose elements are a tuple consisting of:
        → the index number of the original element
        → the original element itself

    data = [10, 20, 30, -10, 40, -5]
    for t in enumerate(data):
        print(t)

    output:
        (0, 10)
        (1, 20)
        (2, 30)
        (3, -10)
        (4, 40)
        (5, -5)

→ at each iteration t is a tuple (index, element)
→ it can be unpacked!

    data = [10, 20, 30, -10, 40, -5]
    for t in enumerate(data):
        index, element = t
        if element < 0:
            data[index] = 0

    data → [10, 20, 30, 0, 40, 0]

→ but we can do one better
→ we can unpack in the for clause itself

    for index, element in enumerate(data):
        if element < 0:
            data[index] = 0

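The difference between rebinding the loop variable and writing back through an index can be seen by running both loops on the same list. The snapshot name after_first_loop is a hypothetical helper for the comparison.

```python
data = [10, 20, 30, -10, 40, -5]

# Rebinding the loop variable does not touch the list.
for number in data:
    if number < 0:
        number = 0
after_first_loop = list(data)   # copy taken for comparison

# enumerate supplies the index needed to write back into the list.
for index, element in enumerate(data):
    if element < 0:
        data[index] = 0

print(after_first_loop)
print(data)
```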
Coding

while Loops

→ different than for
→ here we want to repeat some code as long as some condition is True
→ non-deterministic → we don't necessarily know when condition becomes False
    → maybe never!
    → infinite loop

    while expr:
        <code block>

→ expr is evaluated at the start of each iteration
    → if it is True, execute <code block>
    → if it is False, terminate loop immediately
→ may never execute (if expr is False on first iteration)
→ may never terminate (if expr never becomes False)

    value = 10
    while value < 15:
        print(value)
        value = value + 1       increments value by 1

    output:
        10
        11
        12
        13
        14

    value = 100
    while value < 15:
        print(value)
        value = value + 1

    output:
        no output

    value = 10
    while value < 15:
        print(value)
        value = value - 1       decrements value by 1

    output:
        10
        9
        8
        7
        6
        …
        infinite loop!!

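A sketch of non-deterministic iteration in the spirit of the tick data example from the start of this section. The starting price, the threshold and the 10% drop per step are made-up numbers; real tick data would come from an external source.

```python
price = 100.0
threshold = 60.0
steps = 0

# Keep going while the price is at or above the threshold.
# The number of iterations depends on the data, not on a fixed count.
while price >= threshold:
    price = price * 0.9   # hypothetical 10% drop per tick
    steps = steps + 1

print(steps, price)
```

The loop ends because each pass moves price toward the exit condition; removing that update would give an infinite loop like the one shown above.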
Coding

continue, break, else

Skipping an Iteration

→ sometimes we want to skip an iteration, but without terminating the loop
→ continue
    → immediately jumps to the next iteration

    my_list = [1, 2, 3, 100, 4, 5]

    for i in my_list:
        if i > 50:
            continue
        print(i)
    print('done')

    output:
        1
        2
        3
        4
        5
        done

→ when i is 100
    → continue is executed
    → loop jumps to next iteration

→ continue is not used too often
→ can sometimes make code difficult to read/understand

    for i in my_list:
        if i > 50:
            continue
        print(i)
    print('done')

→ equivalently:

    for i in my_list:
        if i <= 50:
            print(i)
    print('done')

→ less code, easier to read/understand

Early Termination

→ loops can be exited early (before all elements have been iterated)
→ break

    my_list = [1, 2, 3, 100, 4, 5]

    for i in my_list:
        if i > 50:
            break
        print(i)
    print('done')

    output:
        1
        2
        3
        done

→ when i is 100
    → break is executed
    → loop is terminated immediately

→ loop terminating early using break
    → sometimes called abnormal or early termination
→ sometimes want to execute some code if loop terminated normally
    → and different code otherwise (early/abnormal termination)

Example

We are scanning through an iterable, looking for an element equal to 'Python'

If we find the value, we want to terminate our scan immediately, and print 'found', otherwise we want to print 'not found'

    found = False
    for el in my_list:
        if el == 'Python':
            found = True
            print('found')
            break

    if not found:
        print('not found')

The else Clause

→ Python is really confusing here…

for loops can have an else clause
    → but it has nothing to do with the else clause of an if statement
    → the else clause of a for loop executes if and only if no break was encountered
    → in my mind I read it as "else if no break"

    for i in range(5):
        <code block 1>
    else:  # if no break
        <code block 2>

→ <code block 2> executes if loop terminated normally (i.e. no break encountered)

Back to our Example

    found = False
    for el in my_list:
        if el == 'Python':
            found = True
            print('found')
            break

    if not found:
        print('not found')

equivalently:

    for el in my_list:
        if el == 'Python':
            print('found')
            break
    else:  # if no break
        print('not found')

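To see both outcomes of the else clause, the same scan can be run over one list that contains 'Python' and one that does not. The two sample lists and the results list are assumptions made for this sketch.

```python
results = []

# First list contains 'Python', second one does not.
for my_list in (['Java', 'Python', 'Go'], ['Java', 'Go']):
    for el in my_list:
        if el == 'Python':
            results.append('found')
            break
    else:  # if no break
        results.append('not found')

print(results)
```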
Coding

8. Dictionaries

Dictionaries are one of the most important data structures in Python
    → we don't always see them
    → but they're lurking in the shadows!

→ we saw that variables are symbols pointing to objects
    → some string (variable name) is associated with some object

objects are also dictionaries
    → properties are symbols associated to some value (object)
    → methods are names associated to some function    s.upper()    l.append()

→ associating two things together is extremely useful

    a phone book → associates a number to a name
    DNS → associates a URL with a numeric IP address
    book index → associates a chunk of text with a page number

→ associative arrays
    → sometimes called a map
    → abstract concept
    → can be implemented in different ways
    → Dictionaries (or hash maps) are one concrete implementation

Associative Arrays and Dictionaries

Associating Things

→ ASCII table
    → associates a numeric value to certain characters

    space → 32      A → 65      a → 97
    <     → 60      B → 66      b → 98
    @     → 64      …           …
                    Z → 90      z → 122

We could try this:

    keys = [' ', '<', '@', 'A', 'B', …, 'Z', 'a', 'b', …, 'z']
    values = [32, 60, 64, 65, 66, …, 90, 97, 98, …, 122]

→ to find the numerical value of 'A'
    → scan keys to find index of 'A'
    → lookup the value for that index in the values list (array)

Another Approach

instead of storing the data in separate lists, use a list of tuples
    → each tuple has two elements
    → (key, value)

    items = [('A', 65), … , ('Z', 90), ('a', 97), … , ('z', 122)]

→ to find value associated with 'a'
    → scan items looking at first item of tuple until we find 'a'
    → the value we want is the second element of that tuple we just found

→ both approaches have one major drawback
    → must scan an array until we find the correct element
    → the longer the array, the longer time this will take (worst case is last element)

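The two scanning approaches can be written out on a small subset of the ASCII table and compared with a dictionary lookup. Only seven characters are used here as an assumption to keep the sketch short; zip() is a built-in used to pair the two lists, and the name table is hypothetical.

```python
keys = [' ', '<', '@', 'A', 'B', 'a', 'b']
values = [32, 60, 64, 65, 66, 97, 98]

# Approach 1: scan keys for the position, then index into values.
position = keys.index('A')
from_parallel_lists = values[position]

# Approach 2: scan a list of (key, value) tuples.
items = list(zip(keys, values))
from_tuples = None
for key, value in items:
    if key == 'A':
        from_tuples = value
        break

# Dictionary: the key is looked up directly, with no explicit scan.
table = dict(zip(keys, values))
from_dict = table['A']

print(from_parallel_lists, from_tuples, from_dict)
```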
Hash Maps (aka Dictionaries)

→ better implementation is the hash map (or dictionary)
→ similar to the last approach we saw
→ but a special mechanism is used to quickly find a key
    → lookup speed is not affected by size of dictionary

IMPORTANT → keys must be hashable (hence the term hash map)
    → what that means exactly is not important now
    → strings are hashable
    → numerics are hashable
    → tuples may be hashable (if all the elements are themselves hashable)
    → lists are not hashable (in general, mutable objects are not hashable)

Python Dictionaries

→ a dictionary is a data structure that associates a value to a key
→ both value and key are Python objects
    → key must be hashable type (e.g. str, int, bool, float, …) and unique
    → value can be any type
→ type is dict

→ it is a collection of key: value pairs
→ it is iterable
→ but it is not a sequence type
    → values are looked up by key, not by index
    → technically there is no ordering in a dictionary (we'll come back to this point!)
→ it is a mutable collection

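One way to check the hashability rules above is to call the built-in hash() and see whether it raises TypeError. The helper is_hashable below is a hypothetical function written for this sketch, and it uses function and exception syntax that the course introduces later.

```python
def is_hashable(obj):
    """Hypothetical helper: True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

results = {
    'str': is_hashable('abc'),
    'int': is_hashable(10),
    'tuple of ints': is_hashable((1, 2)),
    'tuple holding a list': is_hashable((1, [2, 3])),
    'list': is_hashable([1, 2]),
}

for description, answer in results.items():
    print(description, answer)
```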


Dictionary Literals

→ dictionaries can be created using literals

    d = {'a': 97, 'b': 98, 'A': 65, 'B': 66, 'z': 122, 'Z': 90}

→ we can use a single line, but often we structure it over multiple lines to make it more readable
→ readability matters!

    d = {
        'a': 97,
        'b': 98,
        'A': 65,
        'B': 66,
        'z': 122,
        'Z': 90
    }

Looking up values in a Dictionary

→ use [] just like for sequence types
→ but instead of an index value we specify the key

    d = {
        'a': 97,
        'b': 98,
        'A': 65,
        'B': 66,
        'z': 122,
        'Z': 90
    }

    d['a'] → 97
    d['Z'] → 90

Replacing the Value of an existing Key

    d = {
        'symbol': 'AAPL',
        'date': '2020-03-10',
        'close': 285
    }

To change the value associated to the key 'close':

    d['close'] = 285.34

→ dictionary now looks like this:

    d = {
        'symbol': 'AAPL',
        'date': '2020-03-10',
        'close': 285.34
    }

Adding a New key:value Pair

→ simply assign a value to a new key
    → if key exists, it will be updated as we just saw
    → if key does not exist, a new entry is inserted with key and value
      (and this explains why keys in a dictionary are necessarily unique!)

    d = {
        'symbol': 'AAPL',
        'date': '2020-03-10',
        'close': 285.34
    }

    d['open'] = 277.14

    → d = {
          'symbol': 'AAPL',
          'date': '2020-03-10',
          'close': 285.34,
          'open': 277.14
      }

Deleting a key:value Pair

→ we can remove key:value pairs from a dict
→ use the del keyword

    d = {
        'symbol': 'AAPL',
        'date': '2020-03-10',
        'close': 285.34,
        'open': 277.14
    }

    del d['open']

    → d = {
          'symbol': 'AAPL',
          'date': '2020-03-10',
          'close': 285.34
      }

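The replace, add and delete steps can be run in sequence on the same dictionary. The snapshot after_insert and the flag second_delete_failed are names made up for this sketch; the second del is there only to show what happens when the key is already gone.

```python
d = {'symbol': 'AAPL', 'date': '2020-03-10', 'close': 285}

d['close'] = 285.34        # key exists: value is replaced
d['open'] = 277.14         # key is new: pair is inserted
after_insert = dict(d)     # shallow copy kept for comparison

del d['open']              # pair is removed

# Deleting a key that is no longer present raises KeyError.
second_delete_failed = False
try:
    del d['open']
except KeyError:
    second_delete_failed = True

print(after_insert)
print(d)
print(second_delete_failed)
```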
Common Exceptions

certain operations on dictionaries can lead to KeyError exceptions
    → trying to read a non-existent key
    → trying to delete a non-existent key

trying to use a non-hashable object as a key leads to a TypeError exception

    d[[10, 20]] = 100
    → TypeError: unhashable type: 'list'

→ [10, 20] is a list, and lists are not hashable
    → cannot be used as a key

Coding

Iterating Dictionaries

Dictionaries are Iterable

→ means we can use a for loop to iterate over… what?
    keys? values? key:value pairs?
→ turns out, any of the above

→ default iteration is over the dictionary keys

    data = {'a': 1, 'b': 2, 'c': 3}

    for k in data:
        print(k)

    output:
        a
        b
        c

Iterating over values

→ dictionaries have a method called values()
→ values() returns an iterable containing just the values of the dictionary

    data = {'a': 1, 'b': 2, 'c': 3}

    for v in data.values():
        print(v)

    output:
        1
        2
        3

Iterating over key:value Pairs

→ dictionaries have a method called items()
→ items() returns an iterable containing the keys and values in a tuple

    data = {'a': 1, 'b': 2, 'c': 3}

    for t in data.items():
        print(t)

    output:
        ('a', 1)
        ('b', 2)
        ('c', 3)

→ remember unpacking?

    for k, v in data.items():
        print(f'{k} = {v}')

    output:
        a = 1
        b = 2
        c = 3

The keys() Method

Technically there is also a keys() method
    → behaves like values() or items()
    → but it is an iterable over the keys of the dictionary

    data = {'a': 1, 'b': 2, 'c': 3}

    for k in data.keys():
        print(k)

    output:
        a
        b
        c

→ but default iteration is over the keys anyway
    → so keys() is not particularly useful for iteration

Insertion Order

→ we saw in sequence types that elements have positional order
→ not every iterable has positional order
    → we can pull marbles out of a bag, but there is no particular order

→ for a long time Python dictionaries were the same
    → a "bag" of key:value pairs that could be looked up by key
    → iteration order was not guaranteed to be anything specific

→ changed in Python 3.6 (as a CPython implementation detail; guaranteed by the language since Python 3.7)
    → the iteration order reflects the insertion order

what does insertion order mean?

    d = {'z': 100, 'a': 1, 'b': 2}

→ literal: insertion order is the order in which the key:value pairs are listed out
    → 'z': 100, 'a': 1, 'b': 2

→ adding a new element

    d['x'] = 98

    → 'x': 98 was added last, so right now it's the "last" element
    → 'z': 100, 'a': 1, 'b': 2, 'x': 98

→ but we still cannot retrieve elements by index

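Insertion order can be observed by listing the keys after each change. The update of an existing key in this sketch is an addition to the slide example, included to show that replacing a value does not move the key. Python 3.7 or later is assumed.

```python
d = {'z': 100, 'a': 1, 'b': 2}
d['x'] = 98     # new key: goes to the end
d['a'] = 10     # existing key: value changes, position does not

order = list(d)   # iterating a dict yields its keys in insertion order
print(order)

# Positions are still not valid lookups: 0 is treated as a key.
try:
    d[0]
except KeyError:
    lookup_by_position_failed = True
else:
    lookup_by_position_failed = False
print(lookup_by_position_failed)
```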
Coding

Working With Dictionaries

Membership Testing

→ can test if a key exists in a dictionary using in

    d = {'a': 1, 'b': 2}

    'a' in d → True
    'x' in d → False

→ not in can be used to test if a key is not present

Useful Methods and Functions

    d.clear()   → removes all elements from d

    d.copy()    → creates a shallow copy of d
                → same as sequences
                → use copy.deepcopy() to create a deep copy

    len(d)      → returns number of elements in d

Other Methods to Create Dictionaries

    d = {'a': 1, 'b': 2}

    d = dict(a=1, b=2)

→ symbols must be valid variable names and will be used, in string form, as the keys

→ can create a dictionary with several keys all initialized to the same value

    d = dict.fromkeys(['cnt_1', 'cnt_2', 'cnt_3'], 0)

    d → {'cnt_1': 0, 'cnt_2': 0, 'cnt_3': 0}

→ first argument of fromkeys() should be an iterable (list, tuple, string, etc)

    d = dict.fromkeys('abc', 100)

    d → {'a': 100, 'b': 100, 'c': 100}

Creating Empty Dictionaries

→ often we create dictionaries that start empty
→ and get mutated (modified) as our code runs

    → can use a literal             d = {}
    → can use the dict() function   d = dict()

The get() Method

→ trying to retrieve a non-existent key results in a KeyError exception
→ sometimes we want to have a "default" value if a key does not exist
    → could use if statements and test if key exists using in
    → could try to retrieve the key and handle the exception
    → or, use the get() method

get() can take two arguments
    → the key for which we want the corresponding value
    → the default value we want to use if key does not exist

→ get() can take a single argument, the key
    → default value is None (special object to indicate "nothing")

    d = {'length': 10, 'width': 20}

    d.get('length', 0) → 10
    the key exists, so the corresponding value (10) is returned

    d.get('height', 0) → 0
    the key does not exist, so the default (0) is returned

    d.get('height') → None
    the key does not exist, so the default default-value (None) is returned

→ data in Python is often handled using dictionaries

when we work with data we often have missing values
sometimes, not only is the value missing, but the key as well

→ using get() allows us to simplify our code to assign a default for missing keys

    if 'ssn' in person_dict:
        social = person_dict['ssn']
    else:
        social = ''

    →

    social = person_dict.get('ssn', '')

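A runnable version of the person_dict example. The contents of person_dict are invented for this sketch (the slides do not define it); the point is that get() returns a default without raising and without adding the key.

```python
person_dict = {'name': 'Alex'}   # hypothetical record with no 'ssn' key

social = person_dict.get('ssn', '')    # key missing: default '' is returned
name = person_dict.get('name', '')     # key present: stored value is returned
no_default = person_dict.get('ssn')    # key missing, no default given

print(repr(social), repr(name), no_default)
print('ssn' in person_dict)            # get() did not insert the key
```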
Merging one Dictionary into Another

→ the update() method
    → takes a single argument: another dictionary

    d1.update(d2)

    the key:value pairs of d2 will be merged into d1

    → keys in d2 not in d1 will be added to d1 (with the value)
    → keys in d2 that are present in d1 will overwrite the value in d1 with that of d2

→ Important: d1 is mutated

Merging one Dictionary into Another

    d1 = {'a': 1, 'b': 2}
    d2 = {'b': 30, 'c': 40}

    d1.update(d2)
    d1 → {'a': 1, 'b': 30, 'c': 40}

starting again from the original d1 and d2:

    d2.update(d1)
    d2 → {'b': 2, 'c': 40, 'a': 1}

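Because update() mutates the dictionary it is called on, one common way to keep the original intact is to merge into a copy. The sketch below assumes that approach; the names `merged` and `result` are illustrative only.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 30, 'c': 40}

# Work on a shallow copy so that d1 itself is left untouched.
merged = d1.copy()
result = merged.update(d2)

print(merged)  # {'a': 1, 'b': 30, 'c': 40}
print(d1)      # {'a': 1, 'b': 2}
print(result)  # None: update() works in place and returns None
```

Note that assigning the return value of update() to a variable is a frequent mistake, since that value is always None.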
Coding

Section 9: Sets

What are Python sets?

→ just like a mathematical set
    → a collection of elements
    → no ordering to the elements
    → each element is unique

→ it is an iterable
    → but no guarantee on what the iteration order will be

think of it like a bag of marbles (a collection of marbles)
    to iterate you reach in the bag and grab a marble (any marble)
    continue doing so until the bag is empty
    → no order guaranteed!
    → each marble is unique!



Just like mathematical sets, Python supports set operations

→ union
→ intersection
→ difference
→ membership (is some object an element of a set or not)
→ containment (subset, strict subset, superset, strict superset)

→ if you're a little rusty on sets, you should brush up before proceeding with this section

Python Sets

think back to keys in a dictionary

→ they are unique
→ they are iterable
→ they have no particular order (well, dictionaries maintain insertion order: guaranteed since Python 3.7, an implementation detail in CPython 3.6)
→ keys can be added or removed (dictionary is mutable)
→ they are hashable too - but leave that aside for a moment

does that remind you of a set?

→ we can think of the keys in a dictionary as a set

→ Python's implementation of sets is essentially like a dictionary
    → but no values, only keys
    → because of this, set elements must be hashable too

Python Sets

→ type is set
→ sets are iterable
    → iteration order is not guaranteed (at least not yet)
→ set elements must be hashable
→ sets are mutable
    → sets are not hashable
    → a set cannot be an element of another set, or a key in a dictionary
→ if you really want nested sets, use frozenset
    → immutable equivalent of sets – those are hashable
      (if all the elements are, themselves, hashable)

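A short sketch of the frozenset point above. The variable names are illustrative; the behavior shown (frozensets being usable as set elements and dictionary keys) is standard Python.

```python
inner = {1, 2}

# A regular set is mutable and therefore unhashable, so writing {inner}
# would raise: TypeError: unhashable type: 'set'

# A frozenset built from the same elements is immutable and hashable.
outer = {frozenset(inner), frozenset({3, 4})}
print(len(outer))                  # 2
print(frozenset({2, 1}) in outer)  # True: element order is irrelevant

# A frozenset can also be used as a dictionary key.
labels = {frozenset({'a', 'b'}): 'pair'}
print(labels[frozenset({'b', 'a'})])  # pair
```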


Defining Sets

→ literal form        {1, 'a', True}
    → note the {} – just like for dictionaries
    → but no key:value pairs, just the "keys"
    (careful: 1 == True in Python, so this particular set ends up with only two elements)

→ can also use the set() function        set([1, 'a', True])

→ empty set
    → cannot use {}
        → that would be an empty dictionary
    → set()

Defining Sets

→ can also make a set from any iterable (of hashable elements)

    l = [1, 2, 3, 4, 5]
    s = set(l)
    s → {1, 2, 3, 4, 5}

    l = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
    s = set(l)
    s → {1, 2, 3, 4, 5}

    s = set('python')
    s → {'p', 'y', 't', 'h', 'o', 'n'}

    s = set('parrot')
    s → {'p', 'a', 'r', 'o', 't'}

    (the order in which the elements are displayed may differ)

→ use a for loop for iteration
→ use in for membership testing
→ len(s) returns the number of elements in the set
→ s.clear() removes all the elements of the set
→ s.copy() creates a shallow copy

Coding

Common Set Operations

Disjointedness

→ two sets are disjoint if they have no elements in common

    s1.isdisjoint(s2)
        → True if no common elements exist
        → False if one or more common elements exist

(two elements a and b are considered the same if a == b is True)

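The equality rule in the last line has some consequences that are easy to miss. The sketch below uses made-up sets to show that an integer and an equal float count as the same element.

```python
s1 = {1, 2, 3}
s2 = {4, 5, 6}
s3 = {3.0, 10}

disjoint_12 = s1.isdisjoint(s2)
disjoint_13 = s1.isdisjoint(s3)

print(disjoint_12)  # True: no common elements
print(disjoint_13)  # False: 3 == 3.0, so 3 counts as a common element

# The same rule applies when a set is built: 1 == True,
# so only one of the two survives.
s4 = {1, True}
print(len(s4))  # 1
```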
Adding and Removing Elements

    s = {10, 'b', True}

    s.add(4)          s → {10, 'b', True, 4}
    s.add('b')        s → {10, 'b', True, 4}      → no duplicates in a set

    s.remove('b')     s → {10, True, 4}
    s.remove(100)     → KeyError

    s.discard(4)      s → {10, True}
    s.discard(100)    → no exception      s → {10, True}

Subsets and Supersets

    s1 < s2     → True if s1 is a strict subset of s2
    s1 <= s2    → True if s1 is a subset of s2
    s1 > s2     → True if s1 is a strict superset of s2
    s1 >= s2    → True if s1 is a superset of s2

    {1, 2} < {1, 2, 3} → True        {1, 2} < {1, 2} → False
    {1, 2} <= {1, 2, 3} → True       {1, 2} <= {1, 2} → True
    {1, 2, 3} > {1, 2} → True        {1, 2} > {1, 2} → False
    {1, 2, 3} >= {1, 2} → True       {1, 2} >= {1, 2} → True



Unions and Intersections

    s1 | s2 → returns the union of s1 and s2
    s1 & s2 → returns the intersection of s1 and s2

    s1 = {1, 2, 3}
    s2 = {3, 4, 5}

    s1 | s2 → {1, 2, 3, 4, 5}      (again, set elements are unique)
    s1 & s2 → {3}

Set Difference

the difference s1 - s2 of two sets is all the elements of s1 minus the elements of s2
(i.e. the elements of s1 that are not also in s2)

→ caution: set difference is not commutative      s1 - s2 ≠ s2 - s1

    s1 = {1, 2, 3}
    s2 = {3, 4, 5}

    s1 - s2 → {1, 2}
    s2 - s1 → {4, 5}

→ always keep sets in mind when coding
    → membership testing with sets is generally much faster than with lists or tuples, especially for large collections
    → easy to eliminate duplicate values from a collection
    → easy to find common values between two collections
    → easy to find values in one collection but not in another

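The three "easy to" points above can be sketched with a pair of made-up lists. The e-mail addresses and variable names are purely illustrative.

```python
emails_2023 = ['a@example.com', 'b@example.com', 'a@example.com', 'c@example.com']
emails_2024 = ['c@example.com', 'd@example.com', 'c@example.com']

set_2023 = set(emails_2023)
set_2024 = set(emails_2024)

# Eliminate duplicates (sorted only to get a predictable display order).
unique_2023 = sorted(set_2023)

# Values common to both collections.
in_both = set_2023 & set_2024

# Values in one collection but not in the other.
only_2023 = set_2023 - set_2024

print(unique_2023)        # ['a@example.com', 'b@example.com', 'c@example.com']
print(in_both)            # {'c@example.com'}
print(sorted(only_2023))  # ['a@example.com', 'b@example.com']
```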
Coding

Section 10: Comprehensions

→ comprehensions are an easy way to create new iterables from other iterables
    → like using a loop, but easier and more concise syntax
    → works well for simple cases
    → can quickly become unreadable!
    → readability matters!!

Example

given a list of 2D vectors

    [(0, 0), (1, 1), (1, 2), (3, 5)]

create a new list containing the squared magnitude of each vector

    → [0² + 0², 1² + 1², 1² + 2², 3² + 5²]
    → [0, 2, 5, 34]

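One possible way to write this example, assuming tuple unpacking in the loop clause (the slide itself does not show code for it). The numbers 0, 2, 5, 34 are the sums of squares; the Euclidean lengths are the square roots of those values.

```python
import math

vectors = [(0, 0), (1, 1), (1, 2), (3, 5)]

# Sum of squared components for each vector.
squared = [x ** 2 + y ** 2 for x, y in vectors]
print(squared)  # [0, 2, 5, 34]

# Euclidean length of each vector: the square root of the sum of squares.
magnitudes = [math.sqrt(x ** 2 + y ** 2) for x, y in vectors]
print(magnitudes[0])  # 0.0
```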


List Comprehensions

→ a comprehension is a way to use one iterable to create another
→ more concise than using regular for loops
→ use for simple computations
    → comprehensions can quickly become confusing
→ different types of comprehensions
    → lists
    → dictionaries
    → sets
    → generators

List Comprehensions

a list comprehension is used to generate a list object

Example

we start with an iterable of numbers:

    num = (1, 2, 3, 4, 5)
    or num = [1, 2, 3, 4, 5]

want to create a new list containing the square of each element

    sq = [1, 4, 9, 16, 25]

→ can do this without comprehensions

    numbers = (1, 2, 3, 4, 5)

    sq = []                       # this is the new list we want to create
    for number in numbers:        # loop through every element of the numbers iterable
        sq.append(number ** 2)    # calculate the square of the number and append it to the new list

    sq → [1, 4, 9, 16, 25]

→ or we can use a comprehension

    numbers = (1, 2, 3, 4, 5)

    sq = [number ** 2 for number in numbers]

    [ ]                        indicates we are creating a list
    number ** 2                an expression used to calculate each element of the new list
    for number in numbers      iteration over existing iterable
                               - note how the loop variable is available in the expression to its left

→ in general

    [expression for item in iterable]

→ comprehensions offer a more concise (and usually more efficient!) way of creating one iterable from another

→ in terms of result, these two things do the same

    sq = []
    for number in numbers:
        sq.append(number ** 2)

    sq = [number ** 2 for number in numbers]

→ comprehensions behave like functions
    → they build up and return the calculated iterable

what about something like this?

given an iterable of integers
    → generate a new list that only contains the even integers

    numbers = [1, 2, 3, 4, 5, 6, 7, 8]

    → generate evens = [2, 4, 6, 8]

→ can use a "standard" approach

    numbers = [1, 2, 3, 4, 5, 6, 7, 8]
    evens = []
    for number in numbers:
        if number % 2 == 0:
            evens.append(number)

→ comprehension syntax supports an if clause

    evens = [number for number in numbers if number % 2 == 0]

→ in general

    [expression1 for item in items if expression2]

    the if clause is optional – acts like a filter



Coding

Dictionary/Set Comprehensions

→ similar to list comprehensions
    → use {} instead of []

→ remember literals for dictionaries and sets use {}
    → dictionary elements are pairs      → key:value
    → set elements are single values

    d = {'a': 1, 'b': 2, 'c': 3}
    s = {'a', 'b', 'c'}

Dictionary Comprehension

    {key: value for item in items if expr}

    key      can be any Python expression that calculates a valid (hashable) key
    value    can be any valid Python expression that calculates some value

Example

Given two lists, one of which contains widget names, the other containing the sales number for each of those widgets – ordered the same

    widgets = ['widget 1', 'widget 2', 'widget 3', 'widget 4']
    sales = [10, 5, 15, 0]

→ create a dictionary whose keys are the widget names, and the value the number of sales, but only include widgets that had sales.

→ "traditional" approach

    d = {}
    for i in range(len(widgets)):
        if sales[i] > 0:
            d[widgets[i]] = sales[i]

Example

→ or, we can use a comprehension instead

    widgets = ['widget 1', 'widget 2', 'widget 3', 'widget 4']
    sales = [10, 5, 15, 0]

    d = {
        widgets[i]: sales[i]
        for i in range(len(widgets))
        if sales[i] > 0
    }

    → later we'll see an even easier way to do this

→ compare to "traditional" approach

    d = {}
    for i in range(len(widgets)):
        if sales[i] > 0:
            d[widgets[i]] = sales[i]

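The slide mentions an easier way without naming it. One possibility, offered here as an assumption rather than as the course's intended answer, is to pair the two lists with the built-in zip() function so that no index arithmetic is needed. The loop variable names `widget` and `sold` are illustrative.

```python
widgets = ['widget 1', 'widget 2', 'widget 3', 'widget 4']
sales = [10, 5, 15, 0]

# zip() yields (widget, sold) pairs, one pair per position in the two lists.
d = {
    widget: sold
    for widget, sold in zip(widgets, sales)
    if sold > 0
}

print(d)  # {'widget 1': 10, 'widget 2': 5, 'widget 3': 15}
```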


Set Comprehensions

→ similar to a dictionary comprehension
    → but elements are not key: value pairs
    → just the "key" portion

    {expr1 for item in items if expr2}

    expr1    can be any valid Python expression that calculates some (hashable) value

Example

Given a list of integers, create a set that contains a unique collection of the squares of just the even integers

    numbers = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]

    s = {
        number ** 2
        for number in numbers
        if number % 2 == 0
    }

    s → {4, 16, 36}

Coding

Section 11: Exceptions

What are exceptions?

→ exceptions are special events that happen when something out of the ordinary happens while our code is running

→ an exception is generally unexpected behavior
    → but not always
    → it may be something we expect to happen from time to time
    → we can deal with it and continue running our code

→ so an exception is not necessarily an error
    → but unhandled exceptions will cause our program to terminate

Terminology

exception
    → a special type of object in Python

raising
    → starting an exception flow

exception handling
    → interacting with an exception flow in some manner

unhandled exception
    → an exception flow that is not handled by our code
    → generally results in our program terminating abruptly

Exception Hierarchy

→ Python exceptions form a hierarchy
    (we'll cover what that means precisely when we look at Object Oriented Programming - OOP)

    https://docs.python.org/3/library/exceptions.html#exception-hierarchy

→ basically means that exception classes can be sub-divided into sub-exceptions that are more specific

→ for example a broad exception might be LookupError
    → more specifically it could be an IndexError or a KeyError
    → both of these are categorized more broadly as a LookupError
    → can choose to handle IndexError specifically
    → or LookupError more broadly



Exception Hierarchy

this means that if an exception object is an IndexError exception
    → it is also a LookupError exception
    → and it is also an Exception exception
      (most exceptions we work with are classified as Exception types)

→ confused? think of it this way…

Say we have these classes of objects:

    College Person
        Student
        Staff Member
            Administrator
            Teacher

    → a Teacher is also a Staff Member
    → a Staff Member is also a College Person



Exception Hierarchy

→ similarly we have an exception hierarchy

    Exception
        LookupError
            IndexError
            KeyError
        OSError
            FileNotFoundError
            NotADirectoryError

→ can even write custom exception types
    → later

→ so to handle an IndexError, we could choose to
    → handle IndexError exceptions      (very specific)
    → handle LookupError exceptions
    → handle Exception exceptions       (very broad)

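The hierarchy can be inspected directly with the built-in issubclass() and isinstance() functions. This is only a quick check of the relationships listed above; what classes and subclasses are is covered in the OOP section.

```python
is_lookup = issubclass(IndexError, LookupError)
is_exception = issubclass(LookupError, Exception)
is_os = issubclass(FileNotFoundError, OSError)
siblings = issubclass(KeyError, IndexError)

print(is_lookup)     # True
print(is_exception)  # True
print(is_os)         # True
print(siblings)      # False: KeyError and IndexError are siblings

# An IndexError object is also a LookupError and an Exception.
ex = IndexError('index out of range')
print(isinstance(ex, LookupError))  # True
print(isinstance(ex, Exception))    # True

# The chain of parent classes, most specific first.
chain = [cls.__name__ for cls in IndexError.__mro__]
print(chain)  # ['IndexError', 'LookupError', 'Exception', 'BaseException', 'object']
```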
Python Built-In Exceptions

→ Python has many built-in exception types

    https://docs.python.org/3/library/exceptions.html

Common exceptions include:
    → SyntaxError
    → ZeroDivisionError
    → IndexError
    → KeyError
    → ValueError
    → TypeError
    → FileNotFoundError
    → and many more…



EAFP vs LBYL

→ when we think something unexpected may go wrong in our code

    → figure out if something is going to go wrong before we do it
        → LBYL      Look Before You Leap

    → just do it, and handle the exception if it occurs
        → EAFP      Easier to Ask Forgiveness than Permission

generally in Python
    → follow EAFP
    → exception handling

Why EAFP?

Something that is exceptional should be infrequent

→ if we are dividing two integers in a loop that repeats 1,000 times
    → out of every 1,000 times we run, we expect division by zero to occur 5 times

    LBYL → test that divisor is non-zero 1,000 times
    EAFP → just do it, and handle the division by zero error 5 times

→ often more efficient

→ also trying to fully determine if something is going to go wrong is a lot harder to write than just handling things when they do go wrong

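A small side-by-side sketch of the two styles for the division example. The data and the choice of None as the fallback value are assumptions made for illustration; try…except… itself is introduced a few slides later.

```python
numerators = [10, 20, 30, 40]
divisors = [2, 0, 5, 0]

# LBYL: test the divisor on every iteration, even though zero is rare.
lbyl_results = []
for n, d in zip(numerators, divisors):
    if d != 0:
        lbyl_results.append(n / d)
    else:
        lbyl_results.append(None)

# EAFP: just divide, and deal with the exception only when it occurs.
eafp_results = []
for n, d in zip(numerators, divisors):
    try:
        eafp_results.append(n / d)
    except ZeroDivisionError:
        eafp_results.append(None)

print(lbyl_results)  # [5.0, None, 6.0, None]
print(eafp_results)  # [5.0, None, 6.0, None]
```

Both loops produce the same result; the difference is where the cost is paid (a test on every pass versus a handler that only runs on failure).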
Exception Handling Flow

→ an exception occurs
    → an exception object is created
    → an exception flow is started

→ we do nothing about it
    → program terminates

→ we intercept the exception flow
    → try to handle the exception in some sense, if possible
    → then
        → resume running program uninterrupted
        → or, let the exception resume
        → or, start a new exception flow

Raising Exceptions

→ often we want to start an exception flow ourselves
    → called raising an exception

→ an exception object is associated with an exception flow
    → we create a new exception object
    → we raise the exception object

→ doing this is most useful when we create functions
    → we'll see this later
    → for now, we'll just learn how to raise an exception

Example

→ create an exception object using one of Python's built-in exceptions

    ex = ValueError()

→ usually we include a custom exception message

    ex = ValueError('Name must be at least 5 characters long.')

→ we raise the exception, starting an exception flow

    raise ex

→ often do both in one step

    raise ValueError('custom message')

→ raising an exception ourselves results in the same exception flow that Python uses when it raises some exception
    → we can choose to handle the exception
    → if we don't handle the exception, program terminates

Coding

Handling Exceptions

General Suggestions for Exception Handling

→ in general we do not want to just handle any exception anywhere in our code
    → too much work
    → cannot anticipate every point of failure
    → it's OK for program to terminate - we can figure out what went wrong and attempt to fix it later – possibly handling that case specifically
    → if we don't know exactly why or where the problem occurs in our code, there's not much we can do to recover from the exception

→ we handle exceptions that are raised by small chunks of code
→ we try to handle very specific exceptions, not broad ones
→ usually handle exceptions that we can do something about

try…except…

→ wrap the code we want to implement an exception handler for inside a try block
→ we handle possible exception(s), using except blocks (one or more)

    a = 1
    b = 0
    try:
        result = a / b
    except ZeroDivisionError as ex:
        # "as ex" gives us the exception object that was raised
        # (we can assign it to any symbol we want – here I just chose ex)
        # do something to handle the exception
        result = 0
        print(f'Exception occurred: {ex}')

    print(result)

because we handled the exception, the exception flow was interrupted and our code continues to run normally



Handling and re-raising an exception

→ sometimes we want to handle an exception, but then re-raise the same exception or a different exception
    → often because there's nothing we can do
    → sometimes to create a more explicit exception

to raise an exception in an except block:

    raise                       → re-raises the same exception that caused the except block to be entered
    raise SomeException('…')    → raises a new exception

Application

→ one very common case for re-raising exceptions is for error logging
    → we can view the logs after our program has terminated abnormally

    try:
        …
    except Exception as ex:
        log(ex)
        raise

→ we intercept a broad range of exceptions by handling Exception
→ we log the exception somewhere (console, file, database, etc.)
→ we re-raise the exception and let something else either handle it or terminate the program

→ what do I mean by "something else handles it"?
    → we'll see this more when we cover functions

→ try…except… can be nested
    → usually indirectly
    → but directly too

    try:
        try:
            raise ValueError('something happened')
        except ValueError as ex:
            log(ex)
            raise
    except Exception as ex:
        print(f'ignoring: {ex}')

→ here our ValueError gets handled twice!

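The nested example relies on a log() function that the slides do not define. The sketch below supplies a hypothetical stand-in that simply stores messages in a list, so the order of events can be observed; `logged` and `handled_by` are illustrative names.

```python
logged = []
handled_by = []


def log(ex):
    # Hypothetical stand-in for a real logger: remember the message.
    logged.append(f'logged: {ex}')


try:
    try:
        raise ValueError('something happened')
    except ValueError as ex:
        log(ex)
        handled_by.append('inner')
        raise  # re-raise the same ValueError
except Exception as ex:
    handled_by.append('outer')
    print(f'ignoring: {ex}')  # ignoring: something happened

print(logged)      # ['logged: something happened']
print(handled_by)  # ['inner', 'outer']
```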
Handling Multiple Exception Types

→ not limited to a single except block

    try:
        …
    except IndexError as ex:
        …
    except ValueError as ex:
        …
    except Exception as ex:
        …

→ Python will match the exception to the first type that matches in the sequence of except blocks
    → so write except blocks from most specific to least specific exception types
    → remember that exception hierarchy we looked at!

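A minimal sketch of the "first match wins" rule, using made-up data. The same pair of except blocks handles an IndexError in the first block and a KeyError in the second, because KeyError is a LookupError but not an IndexError.

```python
items = [1, 2, 3]
lookup = {'a': 1}

try:
    items[10]
except IndexError:
    matched_list = 'IndexError'
except LookupError:
    matched_list = 'LookupError'

try:
    lookup['missing']
except IndexError:
    matched_dict = 'IndexError'
except LookupError:
    matched_dict = 'LookupError'

print(matched_list)  # IndexError
print(matched_dict)  # LookupError
```

Had the LookupError block been listed first, it would have caught both cases and the IndexError block would never run.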
The finally Clause

→ sometimes we want some code to run after a try…except… whether an exception occurred or not, and whether it was handled or not
    → use the finally clause

    try:
        …
    except ValueError as ex:
        …
    except IndexError as ex:
        …
    finally:
        # always runs no matter what, before exception flow resumes
        …

Application

→ useful when we want a piece of code to always run
    → whether an exception has occurred or not
    → whether the exception was handled or not
    → whether exception was re-raised or a new one raised

    try:
        open_database_connection()
        start_transaction()
        write_data()
        commit_transaction()
    except WriteException as ex:
        rollback_transaction()
        raise
    finally:
        close_database_connection()

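The database functions on the slide are placeholders, so the sketch below simulates them by appending to a list. `WriteException`, `write_data()` and `events` are hypothetical stand-ins (custom exception classes are covered later in the course); the point is only the order in which the except and finally blocks run.

```python
events = []


class WriteException(Exception):
    """Hypothetical exception type, mirroring the name used on the slide."""


def write_data():
    # Simulate a failed write.
    raise WriteException('disk full')


try:
    try:
        events.append('open connection')
        events.append('start transaction')
        write_data()
        events.append('commit')            # skipped: write_data() raised
    except WriteException:
        events.append('rollback')          # undo first...
        raise                              # ...then re-raise
    finally:
        events.append('close connection')  # runs before the re-raised exception propagates
except WriteException as ex:
    events.append(f'caller saw: {ex}')

print(events)
# ['open connection', 'start transaction', 'rollback', 'close connection', 'caller saw: disk full']
```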
Coding

Section 12: Iterables and Iterators

→ an iterable is something that can be iterated over
    → i.e. we can take one element, then the next, then the next, until we've covered all elements
    → no specific iteration order is mandated

→ obviously a sequence type is iterable (positional ordering)
→ we saw dictionaries can be iterated over (insertion order)
→ but we also saw sets: iterable, but no guaranteed order of any kind

→ general idea behind iteration is then:
    → start somewhere in the collection (at the beginning if that means something)
    → keep requesting the next element
    → until there's nothing left (exhausted)

→ so we have two concepts here

    → a collection of objects that we can iterate over
        → an iterable

    → something that is able to give us the next element when we request it
        → an iterator

→ we are going to look at those in this section

Iterables and Iterators

→ an iterable is something that can be iterated over

→ but we still need something that can
    → give us the next item
    → keep track of what it's given us so far (so it does not give us the same element twice)
    → inform us when there's nothing left for it to give us

→ this is called an iterator
    → used by Python to iterate over an iterable

→ an iterable is just a collection of objects
    → it doesn't know anything about how to iterate
    → however it knows how to create and give us an iterator when we need it

→ iterables implement a special method __iter__() that returns a new iterator
    → can also be called by using the iter() function

→ the iterator has a special method called __next__() that can be called to get the next element
    → can also use the next() function
    → it keeps track of what it has already handed out
      (so iterators are kind of one time use!)
    → it raises a StopIteration exception when next() is called if there's nothing left

The Internal Mechanics of a for Loop

When we write a for loop that iterates over an iterable, what Python is actually doing is this:

    l = [1, 2, 3, 4, 5]

    iterator = iter(l)
    try:
        while True:
            # return next(iterator) – here we'll just print it
            print(next(iterator))
    except StopIteration:
        # expected when we reach the end
        # so silence this exception
        pass

→ the key thing here is that we can see the iterator has some state
    → it has a __next__() method
    → but there's no going back, or starting from the beginning again
    → to do that we have to request a new iterator
    → and that's what a for loop does – it requests a new iterator from the iterable before it starts looping

→ objects such as lists, tuples, strings, dictionaries, sets, range objects are iterables

→ but some objects in Python are iterators – not iterables
    → iterators actually implement an __iter__ method
    → but they just return themselves (with their current state), not a new iterator
    → they allow us to iterate over them
    → but only once

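The difference between an iterable handing out fresh iterators and an iterator returning itself can be observed directly. The variable names below are illustrative.

```python
l = [1, 2, 3]

it1 = iter(l)
it2 = iter(l)

fresh = it1 is it2
returns_self = iter(it1) is it1
print(fresh)         # False: each iter() call on the list gives a new iterator
print(returns_self)  # True: an iterator just returns itself

first_pass = list(it1)
second_pass = list(it1)
print(first_pass)   # [1, 2, 3]
print(second_pass)  # []: it1 is exhausted, there is no starting over

# it2 has its own state and is unaffected by what happened to it1.
first_from_it2 = next(it2)
print(first_from_it2)  # 1

# The list itself can be iterated as many times as we like.
print(list(l), list(l))  # [1, 2, 3] [1, 2, 3]
```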
Coding

Generators

→ we've seen list, dictionary and set comprehensions
    → but no tuple comprehensions…

    result = [i ** 2 for i in range(5)]

    result = []
    for i in range(5):
        result.append(i ** 2)

    → works because list is mutable

→ tuples are not mutable
    → no tuple comprehension



so what does this (valid) expression do?

    (i ** 2 for i in range(5))

→ creates a generator object
    → generators are iterators
        → next()
    → they calculate and hand out elements one at a time as requested

→ unlike [i ** 2 for i in range(5)]
    → calculates all the elements and creates the list immediately

→ generators use lazy iteration
    → a lazy property is one that is not calculated until it is requested

Why use generators?

→ memory efficiency
    → e.g. take all the rows from a file, and write them out, transformed, to some other file

    → read the entire file in memory, iterate through that and save rows
        → entire file in memory!
        → you may not have enough memory!

    → read lines one at a time from file
        → read a row, process it, save it, discard it, request next row, …
        → only one line in memory at any point

Why use generators?

→ performance (possibly)
    → if you only need to read the first few elements of the iterable
    → why go through the computations to calculate all of them?
    → plus unnecessary memory usage on top of that

What's the downside of generators?

→ generators are lazy iterators
    → one-time use
    → not good if you need to iterate through the same iterable many times
    → or even just a few times if the calculations are computationally expensive or take a long time (maybe IO bound)

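A short sketch of the one-time-use behavior, followed by one common workaround (materializing the values into a list when several passes are needed). The variable names are illustrative.

```python
squares = (i ** 2 for i in range(5))

first = next(squares)
second = next(squares)
print(first, second)  # 0 1

rest = list(squares)   # consumes whatever is left
again = list(squares)  # nothing left to hand out
print(rest)            # [4, 9, 16]
print(again)           # []

# If the values are needed more than once, build a list instead.
squares_list = [i ** 2 for i in range(5)]
total = sum(squares_list)
largest = max(squares_list)
print(total, largest)  # 30 16
```

The trade-off is the one described on the earlier slides: the list holds every element in memory, while the generator holds only its current state.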
Creating Generators

→ use generator comprehension
→ use the yield keyword in functions instead of return
    → beyond scope of this course

Coding

Section 13: Functions

→ we have used functions a lot so far

    print()
    iter()
    next()
    list()
    math.sqrt()
    and many more…

→ we can create our own, custom, functions

Why?

→ easy code re-use
    → much easier to code the sqrt() function once
    → and then call it multiple times

→ breaking up complex code into easier to understand chunks
    → problem decomposition

→ when we create a function, we may also want values to be passed into it when it is called
    → arguments or parameters
    → technically not the same thing, but almost everyone uses them interchangeably
        → as do I

when we define a function we may define symbols for the values that will be passed to the function
    → these symbols are called parameters

when we call a function we specify values for these parameters
    → these values are called arguments

→ so a parameter is the symbol we use when we define the function
→ an argument is the value we supply when we call the function

Functions are Python Objects

→ just like everything in Python, functions are objects

→ they have state
    → name (maybe!)
    → code
    → parameters

→ they are callable
    → and always return something when called

→ they can be assigned to a symbol
→ can be passed as an argument to another function
→ can be returned from a function call

Callables

→ an object is callable if it can be called – using ()

→ functions are run by calling them

    print('hello')
    math.sqrt(4)

→ but other types of objects are also callable
    → not necessarily a function object

    my_list.copy()    → calling a method on the my_list object
    range(100)        → creating a new range object

→ more general term is a callable

Custom Functions

→ functions can be defined using the def keyword

    def function_name():
        # indented block
        return <value>

    def               keyword indicates a function is being defined
    function_name     function's name can be any valid Python name (just like variables)
    indented block    this block is called the function body
    return            functions always return some value

→ function body contains any valid Python code
→ this creates a function object
→ the function object is associated with the symbol function_name
  (in the same way a = 10 associates the integer object 10 with the symbol a)

Example

    def say_hello():
        print('Hello!')

    say_hello() → Hello!
    say_hello() → Hello!

but no return?

→ if a return value is not specified, function will return None

Example

    def one():
        return 1

    function returns the value 1 when it is called

    result = one()

    → assigns the return value of calling one() to the symbol result

→ usually functions contain a little more complex code

    from datetime import datetime

    def current_time_utc():
        return datetime.utcnow().isoformat()

    result = current_time_utc()
    result → "2020-03-31T02:44:38.490923"

    (note: datetime.utcnow() is deprecated as of Python 3.12; datetime.now(timezone.utc) is the recommended replacement)

→ functions are usually more helpful when we can pass values to them

    len(my_iter)

    we are passing an argument to the len function

→ every time we call the len function we can pass a different value
    → the function body (implementation) of the len function starts running
    → it is aware of the value that was passed to it

→ same with custom functions
    → need to specify the parameters, by name, that will be used when we call it

    def add(a, b):
        return a + b

    add(2, 3) → 5
    add(10, 1) → 11

    def subtract(a, b):
        return a - b

    subtract(10, 7) → 3

→ when we call subtract(10, 7), how does Python assign 10 to the symbol a, and 7 to b?
    → it does this by position

    def my_func(a, b, c, d):
        …

    my_func(10, 20, 30, 40)

    → positional arguments

Namespaces

→ when a function is called
    → it knows nothing about how it was called before

→ every time a function is called
    → an empty dictionary is created
    → populated with any arguments passed in
        → key = param name, value = argument
        → nothing else
    → then the function code runs

→ this dictionary is called the (local) namespace

y
e m
abs_max(1, -2)
a d
def abs_max(a, b):
abs_a = abs(a) A c
{'a': 1, 'b': -2}

e
{'a': 1, 'b': -2, 'abs_a': 1}
abs_b = abs(b)
if abs_a > abs_b: t
{'a': 1, 'b': -2, 'abs_a': 1, 'abs_b': 2}
y
max_val = abs_a
h B
else:
max_val = abs_b a t
{'a': 1, 'b': -2, 'abs_a': 1, 'abs_b': 2,

M
'max_val': 2}
return max_val

t ©
à after function return, dictionary is wiped out

g h
r i
à consecutive calls to the same function are independent of each other

p y
C o
© MathByte Academy 400
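One way to watch the local namespace is the built-in locals(). The following is a minimal sketch, assuming a variant of abs_max that also returns a copy of its namespace; the extra return value and the names at_start and namespace_at_start are illustrative additions, not part of the slides.

```python
def abs_max(a, b):
    # copy the local namespace right after the call: only the arguments are in it
    at_start = dict(locals())
    abs_a = abs(a)
    abs_b = abs(b)
    max_val = abs_a if abs_a > abs_b else abs_b
    # returning the snapshot is for illustration only
    return max_val, at_start

value, namespace_at_start = abs_max(1, -2)
print(value)                # 2
print(namespace_at_start)   # {'a': 1, 'b': -2}

# a second call starts from a fresh namespace, unaffected by the first call
print(abs_max(-7, 3)[1])    # {'a': -7, 'b': 3}
```

The snapshot is taken with dict(...) so that later assignments inside the function cannot alter what was captured.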
Coding

* Arguments

→ saw how to specify positional parameters in a function

def average(a, b, c, d):
    return (a + b + c + d) / 4

→ but what if we wanted to specify an arbitrary number of parameters?
→ we'd like to call our function with a different number of args each time

average(1)
average(1, 2, 3)
average(1, 2, 3, 4)

→ could write a function to use an iterable as a single argument

def average(iterable):
    return sum(iterable) / len(iterable)

→ but this makes the calling syntax a little weird

average([1, 2, 3])
average([1])

→ would be nicer if we had a mechanism to accept a variable number of args
→ Python supports a special parameter type for this
→ uses a * prefix on a parameter name

def average(*values):
    ...  # return average

→ this means we can call average with any number of arguments

average(1)
average(1, 2, 3, 4, 5)

→ how do we access these values inside the function?
→ use the parameter name → values in this case
→ it will be a tuple containing all the argument values

def average(*values):
    print(type(values))
    print(values)

average(1, 2, 3)
→ values will be a tuple
→ (1, 2, 3)

def average(*values):
    return sum(values) / len(values)

→ we may want to do something if someone calls this function with no arguments
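With no arguments, values is an empty tuple and len(values) is 0, so the division above fails with a ZeroDivisionError. A minimal sketch of one way to handle it, assuming we choose to return None for an empty call (that policy is an assumption of this example; raising an exception would be equally reasonable):

```python
def average(*values):
    # values is always a tuple; it is empty when no arguments are passed
    if not values:
        # assumed policy for this sketch: no values means no average
        return None
    return sum(values) / len(values)

print(average(1, 2, 3))   # 2.0
print(average(1))         # 1.0
print(average())          # None
```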
→ often you will see code that uses *args
→ the * is the important part
→ there is nothing special about the name args
→ as we just saw, we can use any valid name
→ use a meaningful name
→ args is often too generic

Coding

Default Values

→ possible to specify optional parameters
→ means function can be called without passing in the argument
→ but we still have that parameter
→ it needs a value
→ we can specify a default value to use if the argument is not supplied
def func(a=1):    # a=1 → a default value to use if a is not supplied when function is called
    print(a)

func() → 1
func(10) → 10

→ once you specify a positional parameter with a default value
→ all positional parameters after that must specify a default value too
→ with the exception of a starred parameter

→ how would you interpret this?

def func(a=1, b):
    pass

func(10)

→ is 10 supposed to go into a? → in which case we're short one argument
→ or use default for a and assign 10 to b?
→ don't know! (and neither does Python: this definition is rejected with a SyntaxError)

→ so once we have a default argument, we need to specify defaults for all parameters after it

def func(a, b, c=1, d=2):
    ...

but this is still ok:

def func(a, b=1, *args):
    ...

func(10)             a → 10   b → 1   args → ()
func(10, 2)          a → 10   b → 2   args → ()
func(10, 2, 3, 4)    a → 10   b → 2   args → (3, 4)
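The three calls above can be checked directly. This is a small sketch in which func simply returns what it received, so the assignment of arguments to parameters becomes visible; the return statement is an addition for illustration.

```python
def func(a, b=1, *args):
    # return everything so we can see how the arguments were assigned
    return a, b, args

print(func(10))            # (10, 1, ())
print(func(10, 2))         # (10, 2, ())
print(func(10, 2, 3, 4))   # (10, 2, (3, 4))
```

Note that the empty tuple is written (), and a one-element tuple needs a trailing comma, as in (3,).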
Coding

Keyword-Only Arguments

→ we saw how positional parameters can be passed
    → positionally
    → as a named argument
        → also called a keyword argument

def func(a, b, c):
    ...

func(1, 2, 3)
func(c=3, b=2, a=1)

→ passing an argument as a keyword argument is optional

→ can also make passing an argument by name mandatory
→ these are called keyword-only arguments
→ keyword-only parameters must come after all positional parameters

def func(a, b, c):
    ...

→ we want c to always be passed as a named argument
→ somehow we have to tell Python that after a and b there are no more positional arguments
→ one way is to use a * parameter

def func(a, b, *args, c):
    ...

→ since *args will scoop up every remaining positional argument
→ c must be a keyword-only argument

func(10, 20, 30, c=100)    a → 10   b → 20   args → (30,)   c → 100
func(10, 20, c=100)        a → 10   b → 20   args → ()      c → 100

→ but this allows someone to pass in as many positional arguments as they want
→ what if we don't want that?

want to make this allowed:
    func(10, 20, c=100)
but not this:
    func(10, 20, 30, 40, c=100)

→ we still have to tell Python that there are no more positional arguments
→ we use a * without a parameter name
def func(a, b, *, c):
    ...

→ a and b are positional parameters
→ there are no more positional parameters after that
→ so c is a keyword-only argument

func(10, 20, c=100)        → allowed
func(10, 20, 30, c=100)    → not allowed (TypeError)

→ using this technique c must be passed as a named argument
→ a and b can be passed as positional arguments
→ or as named arguments

func(b=2, a=1, c=3)
func(1, c=3, b=2)
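A short sketch of the bare * in action. The function body (returning the three values) is an illustrative addition; the point is which calls succeed and which raise a TypeError.

```python
def func(a, b, *, c):
    return a, b, c

print(func(10, 20, c=100))    # (10, 20, 100)
print(func(b=2, a=1, c=3))    # (1, 2, 3)

# c cannot be passed positionally: the bare * ends the positional parameters
try:
    func(10, 20, 30)
except TypeError as ex:
    print('TypeError:', ex)

# c has no default here, so leaving it out is also an error
try:
    func(10, 20)
except TypeError as ex:
    print('TypeError:', ex)
```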
Default Values

→ can also assign default values to keyword-only arguments

def func(a, b, *, c=100):
    ...

→ c is optional, and will default to 100
→ if c is passed, it must still be passed as a named argument

func(10, 20)          c → 100
func(10, 20, c=30)    c → 30

→ can mix default values for both positional and keyword-only arguments

Arbitrary Number of Keyword-only Parameters

→ saw * for arbitrary number of positional arguments
→ use ** for arbitrary number of keyword-only arguments

func(a, b, *args, c, d, **kwargs)

→ a and b are positional
→ c and d are keyword-only
→ extra positional arguments are scooped up into args
→ extra named arguments are scooped up into kwargs

→ ** keyword-only arguments are scooped up into a dictionary
    → key is the argument name
    → value is the argument value

def func(a, *, d, **others):    # others is a dict
    ...

func(10, d=2, x=10, y=20)
func(a=10, d=2, x=10, y=20)
func(x=10, y=20, d=2, a=10)

in all three calls:
    a → 10
    d → 2
    others → {'x': 10, 'y': 20}
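The three calls can be compared directly. In this sketch func returns its arguments (an addition for illustration) so we can confirm that all three calls bind a, d and others identically.

```python
def func(a, *, d, **others):
    # others collects every named argument that is not a or d
    return a, d, others

r1 = func(10, d=2, x=10, y=20)
r2 = func(a=10, d=2, x=10, y=20)
r3 = func(x=10, y=20, d=2, a=10)

print(r1)               # (10, 2, {'x': 10, 'y': 20})
print(r1 == r2 == r3)   # True
print(type(r1[2]))      # <class 'dict'>
```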
Coding

Lambda Functions

→ lambda functions are just functions
→ they are not defined using a def and block of code
→ it is an expression that returns a function object
→ Python does not create a symbol or a name for the function
→ just returns the function object
→ we can assign it to a variable or pass it as an argument
→ also called anonymous functions
→ they are very simple functions (no code block)

lambda a, b: a + b

a, b → function parameters
a + b → what the function should return
    → must be a single expression
    → no code block
    → so no loops, try…except…, if…else…, etc

→ this expression returns a function object
→ we need to assign it to a symbol if we want to use it

f = lambda a, b: a + b

f(10, 20) → 30

→ can always use a function defined using def instead of these lambdas
→ generally used to write shorter code in some simple cases
→ we'll see examples of this in the next sections
→ but you don't have to use them
→ however they do get used often, so you should be aware of them
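To make the "just functions" point concrete, here is a small sketch comparing a def function and a lambda side by side. The names add, add_lambda and apply are illustrative choices for this example.

```python
def add(a, b):
    return a + b

add_lambda = lambda a, b: a + b

# both are ordinary function objects and behave the same way
print(add(10, 20), add_lambda(10, 20))    # 30 30
print(type(add) is type(add_lambda))      # True

# the only visible difference: the lambda has no name of its own
print(add.__name__, add_lambda.__name__)  # add <lambda>

# a lambda can be passed as an argument without first assigning it to a symbol
def apply(func, a, b):   # hypothetical helper for this sketch
    return func(a, b)

print(apply(lambda a, b: a * b, 3, 4))    # 12
```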
Coding

Some Built-In Functions
14

→ in this section we are going to look at some more of Python's built-in functions
→ there are many more!
    → https://docs.python.org/3/library/functions.html
→ and that does not even include the thousands of functions available in Python's standard library
→ we'll study some of them later in this course
    → math and stats
    → time and datetime
    → csv
    → random and more…
Rounding

→ round() is a built-in function that can be used to round floats
→ uses banker's rounding
    → also called round half to even
→ rounds to the nearest value (it does not truncate)
    1.8 → 2
    -1.8 → -2
→ ties round to closest even digit
    1.5 → 2
    2.5 → 2
→ good choice to eliminate various biases
→ use round() to round to an integer

round(1.8) → 2
round(-1.8) → -2
round(1.5) → 2
round(2.5) → 2
→ can also use round() to round to closest multiple of a power of 1/10

round(value, exponent)

exponent is used to specify what power of 1/10 to round to

→ let's look at it mathematically first (i.e. without worrying about float representations)

round(x, 1)     → rounds to nearest 0.1   (10^-1)
round(x, 2)     → rounds to nearest 0.01  (10^-2)
round(x, -1)    → rounds to nearest 10    (10^1)
round(x, -2)    → rounds to nearest 100   (10^2)

                         round to closest multiple of:
round(127.1892, 3)       10^-3 → 0.001     → 127.189
round(127.1892, 2)       10^-2 → 0.01      → 127.19
round(127.1892, 1)       10^-1 → 0.1       → 127.2
round(127.1892, 0)       10^0  → 1         → 127.0
round(127.1892, -1)      10^1  → 10        → 130.0
round(127.1892, -2)      10^2  → 100       → 100.0
round(127.1892, -3)      10^3  → 1000      → 0.0


Rounding Ties in Floats

→ technically rounds to closest number that ends with an even digit

round(0.125, 2) → 0.12

→ so why this?

round(0.325, 2) → 0.33      why not 0.32?

→ remember floats do not have (in general) an exact representation!

0.325 → 0.325000000000000011102230246252

→ so this is not a tie!
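The exact value stored in a float can be inspected by converting it to a Decimal. The sketch below uses the standard library decimal module (not covered at this point in the slides) purely as a way to look at the stored value.

```python
from decimal import Decimal

# integer ties: round half to even
print(round(1.5), round(2.5))    # 2 2

# 0.125 is stored exactly, so rounding to 2 digits is a true tie
print(Decimal(0.125))            # 0.125
print(round(0.125, 2))           # 0.12

# 0.325 is stored as a value slightly above 0.325, so there is no tie
print(Decimal(0.325) > Decimal('0.325'))   # True
print(round(0.325, 2))           # 0.33
```

Decimal(0.325) converts the float as stored, while Decimal('0.325') is the exact decimal value, which is why the two can be compared.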
Coding

sorted, min and max

Sorting Numbers

→ numbers have a natural sort order
→ they can be sorted ascending or descending by that sort order
→ sorted is a built-in function that can be used to sort a collection of numbers
    → single positional argument: an iterable containing the numbers
    → by default, it sorts in ascending order
    → keyword-only argument to reverse the sort order
        → default is False → sorts ascending
        → specify reverse=True → sorts descending
→ always returns a new list
→ original iterable is not mutated

t = (1, 10, 2, 9, 3, 8)

sorted(t)
→ [1, 2, 3, 8, 9, 10]

sorted(t, reverse=True)
→ [10, 9, 8, 3, 2, 1]
Sorting Strings

Numbers have a natural sort order
Strings also have a natural sort order in Python
→ lexicographic order
→ dictionary order, alphabetical order

BEWARE The characters a and A are not the same

→ Python assigns a numerical character code (the Unicode character code) to each character in a string

A → 65
Z → 90
a → 97
z → 122

→ 'A' < 'Z' < 'a' < 'z'

→ so Python will use "alphabetical" sorting, but all upper case letters will be sorted before any lower case letter
→ natural sort order of strings is case sensitive

sorted(['Boy', 'baby'])
→ ['Boy', 'baby'] (ascending sort)
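The character codes can be looked up with the built-in ord(). A small sketch, using a made-up list of words, showing how those codes drive the sort order:

```python
# character codes used when comparing strings
print(ord('A'), ord('Z'), ord('a'), ord('z'))   # 65 90 97 122
print('A' < 'Z' < 'a' < 'z')                    # True

# every upper case word sorts before every lower case word
words = ['baby', 'Boy', 'apple', 'Zebra']       # example data
print(sorted(words))   # ['Boy', 'Zebra', 'apple', 'baby']
```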
Sorting Other Types

→ we can "visually" sort other types of objects
→ list of Persons
→ we can sort this list
    → by name
    → by age
    → by profession
→ we always sort by some property of the objects we are sorting
→ we'll come back to this in a later section

min and max

→ closely related to sorting
→ to find the minimum of a collection
    → sort the collection (by something) in ascending order
    → pick the first element
→ to find the maximum of a collection
    → sort the collection (by something) in descending order
    → pick the first element

(or you could sort in the other direction in both cases and pick the last element)
min([1, 10, 2, 9, 8]) → 1
max([1, 10, 2, 9, 8]) → 10

min([]) → ValueError exception

→ can specify a default value to return if the iterable is empty
→ keyword-only argument

min([], default=0) → 0

→ can also use an arbitrary number of positional arguments instead

min(1, 10, 2, 9, 3, 8) → 1
max(1, 10, 2, 9, 3, 8) → 10

→ we'll come back to min and max when we look at sorting again later in this course
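The calls above, plus the relationship to sorting, can be sketched as follows. The variable name values is an illustrative choice.

```python
values = [1, 10, 2, 9, 8]

print(min(values), max(values))     # 1 10
print(min(1, 10, 2, 9, 3, 8))       # 1 (separate positional arguments)
print(min([], default=0))           # 0

# without a default, an empty iterable raises ValueError
try:
    min([])
except ValueError as ex:
    print('ValueError:', ex)

# same results as sorting and picking the first element
print(sorted(values)[0] == min(values))                 # True
print(sorted(values, reverse=True)[0] == max(values))   # True
```

In practice min and max are preferable to sorting when only one value is needed, since they make a single pass over the data.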
Coding

The zip() Function

→ the zip() function is a very useful and often used function
→ consider these two lists that contain related information

l1 = ['a', 'b', 'c', 'd', 'e', 'f']
l2 = [97, 98, 99, 100, 101, 102]

→ we want to create a list of tuples that contain the corresponding elements from l1 and l2

→ could do this:

combo = [(l1[i], l2[i]) for i in range(len(l1))]

combo → [('a', 97), ('b', 98), ('c', 99),
         ('d', 100), ('e', 101), ('f', 102)]

but we may have an issue if the two lists are not of the same length
→ have to stop at the shortest of the two lengths

l1 = ['a', 'b', 'c', 'd', 'e']
l2 = [97, 98, 99]

combo = [(l1[i], l2[i]) for i in range(min(len(l1), len(l2)))]

combo → [('a', 97), ('b', 98), ('c', 99)]
→ that's what the zip() function does!

l1 = ['a', 'b', 'c', 'd', 'e']
l2 = [97, 98, 99]

combo = zip(l1, l2)

BEWARE zip() returns an iterator
→ remember those?
→ can only iterate through them once

list(combo) → [('a', 97), ('b', 98), ('c', 99)]
list(combo) → []

→ if you want to iterate multiple times over the same zipped collection
    → store it into a list

combo = list(zip(l1, l2))

→ often don't need to

zip() does not actually create anything other than an iterator
→ no physical space has been used for the tuples
→ iterating over zip() result, just iterates over the iterables simultaneously
→ costs almost nothing calling zip(l1, l2) multiple times

→ zip is extensible
→ not limited to two iterables
→ any number of iterables (positional args)

l1 = [1, 2, 3]
l2 = [1, 2, 3, 4, 5]
l3 = [1, 2, 3, 4, 5, 6, 7]

zip(l1, l2, l3) → (1, 1, 1)
                  (2, 2, 2)
                  (3, 3, 3)

→ always returns an iterator that produces tuples
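A compact sketch of the behaviours described above: stopping at the shortest iterable, single-pass iteration, and zipping more than two iterables. The names first_pass, second_pass and three_way are illustrative.

```python
l1 = ['a', 'b', 'c', 'd', 'e']
l2 = [97, 98, 99]

combo = zip(l1, l2)
first_pass = list(combo)    # consumes the iterator
second_pass = list(combo)   # nothing left the second time
print(first_pass)           # [('a', 97), ('b', 98), ('c', 99)]
print(second_pass)          # []

# calling zip() again is cheap, so we can simply re-zip when looping
for letter, code in zip(l1, l2):
    print(letter, code)

# any number of iterables: stops at the shortest one
three_way = list(zip([1, 2, 3], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5, 6, 7]))
print(three_way)            # [(1, 1, 1), (2, 2, 2), (3, 3, 3)]
```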
Coding

Higher Order Functions
15

in Python any object can:
→ be passed to a function as an argument (or callable in general)
→ be returned from a function (or callable in general)

→ functions are objects
→ functions can be passed to and/or returned from functions
→ these are called higher order functions

it's a math concept too – often referred to as operators or functionals

(functions that neither take a function as an argument nor return one are called first order functions)

amongst other things:
→ a function definition can itself contain another function definition
→ and can return it

This means we can call a function that builds another function and runs it, or even returns it

→ what becomes interesting is that variables in the outer function become available to the inner function

def say_hello(first_name, last_name):
    def assemble_name():
        return ' '.join([first_name, last_name])

    return ''.join(['Hello, ', assemble_name(), '!'])

say_hello('Eric', 'Idle') → 'Hello, Eric Idle!'

→ we are going to study this in this chapter, along with higher order functions
→ in subsequent chapters we'll look at some important fundamental applications
Passing and Returning Functions

Passing Functions as Arguments

→ function arguments can be functions
→ the object is passed, not called
→ so don't use () to pass a function, that would pass the result of the function!

def add(a, b):
    return a + b

def apply(func, a, b):       # func is going to receive a function object
    result = func(a, b)      # we now call func, which is whatever function was passed in
    return result

apply(add, 2, 3)             # pass the add function to apply
→ 5

Nested Functions

→ function bodies can contain any valid Python code
→ including defining functions

def say_hello(name):
    def prefix():            # we are creating a new function prefix
        return 'Hello, '

    msg = prefix() + name    # calling prefix()
    return msg
Returning Functions

→ a function can also return a function

def identity(func):     # passing in a function
    return func         # returning the same function

def add(a, b):
    return a + b

f = identity(add)       # f is now a symbol pointing to add

f(2, 3) → 5

→ silly example!

Returning Functions

→ often we return a nested function

def generate_func(name):
    def add(a, b):
        return a + b

    def mult(a, b):
        return a * b

    if name == 'sum':
        return add
    else:
        return mult

f = generate_func('sum')
f(2, 3) → 5

→ still a silly example!
→ we'll see real examples soon!
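The difference between passing a function and passing the result of calling it is easy to check. A minimal sketch reusing the add, apply and generate_func ideas from the slides; here add is defined at module level (a simplification of this example) so that we can compare function objects.

```python
def add(a, b):
    return a + b

def apply(func, a, b):
    # func is whatever function object was passed in
    return func(a, b)

def generate_func(name):
    def mult(a, b):
        return a * b
    # return a function object, without calling it
    return add if name == 'sum' else mult

print(apply(add, 2, 3))               # 5

f = generate_func('product')          # any name other than 'sum' returns mult
print(f(2, 3))                        # 6

print(generate_func('sum') is add)    # True: the same function object came back

# add is a function object; add(2, 3) is just the integer 5
print(callable(add), callable(add(2, 3)))   # True False
```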
Coding

The map() Function

→ the map() function calls a specified function for every element of some iterable
→ very similar to doing something like this:

def my_map(func, iterable):
    result = [func(element) for element in iterable]
    return result

→ here we are creating a list that contains the results of applying func to every element of iterable
→ but it creates a list
→ can take a lot of space if iterable is large
→ especially wasteful if we don't iterate over all the values

→ map() returns an iterator

iterator = map(func, iterable)

as we iterate over that iterator:
→ Python moves to the next item in iterable
→ calls func(element)
→ returns the result

→ less wasted space
→ saves computations if we don't iterate over the whole list

→ equivalently we could also just use a generator expression

(func(el) for el in iterable)
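The lazy behaviour can be made visible by recording each call. In this sketch the calls list and the square function are illustrative helpers, not part of the slides.

```python
calls = []

def square(x):
    calls.append(x)      # record each call so we can see when it happens
    return x ** 2

squares = map(square, [1, 2, 3, 4])
print(calls)             # [] : nothing has been computed yet

print(next(squares))     # 1
print(calls)             # [1] : only the first element was processed

print(list(squares))     # [4, 9, 16] : the remaining elements
print(calls)             # [1, 2, 3, 4]
```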
Coding

Closures

in the last videos we saw that function definitions can be nested within another function

def outer():
    def inner():
        ...

and we saw that we can return the inner function from the outer function

def outer():
    def inner():
        ...

    return inner

but we can create variables in the outer function also, or pass arguments when we call it

def outer(a, b):
    c = 100

    def inner():
        ...

    return inner

→ inner can "see" those variables
→ it even retains these values when it is returned
→ the inner function can "capture" those variables
→ this is called a closure


def outer(a):
    def inner():
        return a * 10
    return inner

f = outer(2)

→ f is now the inner function that closes over a with a value of 2
→ a is called a free variable of the closure f
→ we can call f

f() → 20

→ but there are some rules!
→ you can always "read" a variable from the outer scope

def outer():                # outer scope
    c = 100
    def inner():            # inner scope
        d = c * 10          # reading c automatically uses the one in the outer scope
        return d
    return inner

f = outer()

outer → {'c': 100, 'inner': <function>}
inner → closure inner, with c=100
→ but things change if we set that symbol to a value in the inner scope

def outer():
    c = 100

    def inner():
        c = 20              # here we are setting c to some value
        return c * 10

    return inner

f = outer()

→ Python ignores c from outer scope
→ creates a new symbol c in the inner scope
(there are ways around this, but beyond scope of this course)

outer → {'c': 100, 'inner': <function>}
inner → {'c': 20}
Example

def power(n):
    def inner(x):
        return x ** n
    return inner

squares = power(2)

→ call power(2)
→ power runs with n = 2
→ inner is a function that "captures" n = 2 → a closure
→ the closure is returned
→ squares is the closure: function inner with n=2 that takes one argument (x)

squares(3) → 9

→ we can re-use power multiple times:

def power(n):
    def inner(x):
        return x ** n
    return inner

squares = power(2)      [inner with n = 2]
cubes = power(3)        [inner with n = 3]

squares(3) → 9
cubes(3) → 27
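The captured value can be inspected on the function object itself. This sketch uses the __code__.co_freevars and __closure__ attributes of function objects, which go slightly beyond what the slides cover, to show that each closure carries its own n.

```python
def power(n):
    def inner(x):
        return x ** n
    return inner

squares = power(2)
cubes = power(3)

print(squares(3), cubes(3))                    # 9 27

# the free variable's name, and the value each closure captured
print(squares.__code__.co_freevars)            # ('n',)
print(squares.__closure__[0].cell_contents)    # 2
print(cubes.__closure__[0].cell_contents)      # 3

# two calls to power produce two independent function objects
print(squares is cubes)                        # False
```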
Coding

Sorting and Filtering
16

In this chapter we are going to focus on:
→ filtering iterables
→ sorting iterables
→ revisit min and max

Filtering

filtering is the selection of a subset of items based on whether some condition is true or not

→ given a list of numbers from 1 to 100, filter this list to contain even numbers only
→ can think of it this way:

            [1, 2, 3, 4, …, 98, 99, 100]
is_even?     F, T, F, T, …,  T,  F,   T

→ apply a function (is_even) to every item in the list
→ only keep items for which function returns True
Predicate Functions

a predicate function is simply a function of one or more arguments that returns True or False

for filtering in general:
→ given an iterable and a predicate function
→ only keep the items for which predicate function evaluates to True

iterable
    l = [1, 2, -5, 6, -1, 0]

predicate function
    def is_positive(x):
        return x > 0

filter
    l = [1, 2, -5, 6, -1, 0]
    pred = is_positive
    → 1, 2, 6
→ Python has a filter function that works exactly that way

filter(pred, iterable)

data = [1, 2, 3, -1, -2, 0]

def is_positive(x):
    return x > 0

filter(is_positive, data)
→ lazy iterator
→ can only iterate through this once
→ 1, 2, 3

def is_even(x):
    return x % 2 == 0

filter(is_even, data)
→ 2, -2, 0

→ can also use a predicate function created via a lambda

is_positive = lambda x: x > 0
filter(is_positive, data)

→ and often, directly inline with the call to filter:

filter(lambda x: x > 0, data)
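Because filter() returns a lazy iterator, we usually wrap it in list() to see the values. A short sketch of the examples above; the list comprehension at the end is an equivalent alternative, shown for comparison.

```python
data = [1, 2, 3, -1, -2, 0]

def is_even(x):
    return x % 2 == 0

evens = filter(is_even, data)
print(list(evens))    # [2, -2, 0]
print(list(evens))    # [] : the iterator is exhausted after one pass

# predicate written inline as a lambda
positives = list(filter(lambda x: x > 0, data))
print(positives)      # [1, 2, 3]

# the same selection written as a list comprehension
print([x for x in data if x > 0])   # [1, 2, 3]
```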
Coding

Sorting

→ looked at the sorted function before
→ given an iterable
→ return a list, that has been sorted
→ but by what?

sorted([10, 9, 3, 1, 2, 8]) → 1, 2, 3, 8, 9, 10

→ sorting numbers is very intuitive
→ numbers have a natural sort order, and we can sort the elements based on their values
→ sorting strings was a bit more complicated
→ assign an integer value to each character, and use that to sort strings

→ we can sort the same data in different ways

data = [3, 1, -6, -2, -4, 5]

→ sort based on value             → [-6, -4, -2, 1, 3, 5]
→ sort based on absolute value    → [1, -2, 3, -4, 5, -6]
→ sort based on the second digit in the square root of the absolute value

→ maybe something more practical
→ sort a collection of objects (symbol, open, high, low, close)
    → by symbol
    → by open
    → by high - low
    etc…


→ how do we sort by a different criterion?
→ how do we sort arbitrary objects that may not even have a natural sort order?
→ approach is similar to how filter worked
→ iterable
→ to each element in iterable, assign a value that is used to sort

data = [3, 1, -6, -2, -4, 5]

sort by absolute value:

    data        →  3,  1, -6, -2, -4,  5
    sort values →  3,  1,  6,  2,  4,  5

    sorted      →  1, -2,  3, -4,  5, -6
    sort values →  1,  2,  3,  4,  5,  6

→ just like filter used a predicate function to calculate True/False for each element
→ sorted can take a key function as a named argument
→ key function returns a value for each element
→ those values have a natural sort order
→ usually numbers, but does not have to be

data = [3, 1, -6, -2, -4, 5]

def sort_key(x):
    return abs(x)

sorted(data, key=sort_key)
sorted(data, key=lambda x: abs(x))
sorted(data, key=abs)

→ the main point is that key is just a function that returns a value for each element of the iterable
→ sorted() then uses that value to sort the items in the iterable

data = {'a': 300, 'b': 100, 'c': 200}

→ sort the keys of the dictionary based on the corresponding value

key_func(dict_key) → corresponding value

sorted(data.keys(), key=lambda k: data[k])
→ ['b', 'c', 'a']

sorted(data.keys(), key=lambda k: data[k], reverse=True)
→ ['a', 'c', 'b']
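The key examples can be run as written. This sketch also adds one extra case, sorting strings case-insensitively with key=str.lower, as an illustration of a key that is not a number; the names prices and words are made up for the example.

```python
data = [3, 1, -6, -2, -4, 5]

print(sorted(data))            # [-6, -4, -2, 1, 3, 5]
print(sorted(data, key=abs))   # [1, -2, 3, -4, 5, -6]
print(data)                    # [3, 1, -6, -2, -4, 5] : original is not mutated

# sort dictionary keys by their associated values
prices = {'a': 300, 'b': 100, 'c': 200}
print(sorted(prices, key=lambda k: prices[k]))                 # ['b', 'c', 'a']
print(sorted(prices, key=lambda k: prices[k], reverse=True))   # ['a', 'c', 'b']

# keys do not have to be numbers: compare lower case versions of the strings
words = ['Boy', 'baby', 'apple']
print(sorted(words, key=str.lower))   # ['apple', 'baby', 'Boy']
```

The key function only decides the ordering; the list returned by sorted() still contains the original elements.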
Coding

min and max

→ previously we saw that to get the minimum of an iterable
    → sort iterable from low to high
    → take first element
→ similarly with maximums
→ but we just saw that sorting always uses an associated key
→ so when we talk of min and max
→ we really have the same thing - a sort key is used

→ min(iterable, key=<func>)
→ max(iterable, key=<func>)

data = [-1, 2, -3, 4, -5]

min(data) → -5
max(data) → 4

but this assumes a natural sort order
(i.e. key func is an identity function – returns the iterable value as-is)

→ let's say we want the sort to be based on the absolute value

min(data, key=abs) → -1
max(data, key=abs) → -5
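A brief sketch confirming that min and max with a key agree with sorting by the same key. The prices dictionary is a made-up example.

```python
data = [-1, 2, -3, 4, -5]

print(min(data), max(data))                     # -5 4
print(min(data, key=abs), max(data, key=abs))   # -1 -5

# same answers as sorting with the key and taking the ends
by_abs = sorted(data, key=abs)
print(by_abs[0], by_abs[-1])                    # -1 -5

# the original element is returned, not the key value
prices = {'a': 300, 'b': 100, 'c': 200}
print(min(prices, key=lambda k: prices[k]))     # b
print(max(prices, key=lambda k: prices[k]))     # a
```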
Coding

Decorators
17

→ decorators are a form of metaprogramming
→ they allow us to wrap functionality around an already defined function
→ without having to modify the code of the original function

leverages:
→ closures
→ functions as first class citizens (aka higher order functions)
→ re-assign any object to an existing symbol

Why are decorators useful?

→ let's use an example to understand this
→ suppose we have a program with some functions called over and over again
    → fun1, fun2, fun3, etc
→ every time one of those functions is called, we want to produce a log
→ maybe just print to the console that the function was called
→ we could certainly put the "logging" functionality into each function
def fun1():
    print('Called fun1.')
    …

def fun2():
    print('Called fun2.')
    …

def fun3():
    print('Called fun3.')
    …

def fun4():
    print('Called fun4.')
    …

→ repeating the same code multiple times
→ what if we want to include date/time call was made
    → go back and edit logging code inside each function
→ 3 weeks later, oh yeah, add some timing to it too
    → go back and edit logging code inside each function
→ too much typing! → error-prone → easy to be inconsistent

→ instead want to write the logging code once
→ and "apply" it to each function we want to log
→ basically we want to build a second function that will:
    → run some code
    → execute the original function with the arguments that were passed in
    → run some code
    → return the result of the call

→ when we call fun1, fun2, etc

fun1(10, 20)    → start timing
                → result = fun1(10, 20)
                → stop timing
                → log call, date/time and timing
                → return result

fun2(10)        → start timing
                → result = fun2(10)
                → stop timing
                → log call, date/time and timing
                → return result

→ using some common code


Decorators

→ recall nested functions

def outer():
    def inner():
        ...

    return inner

calling outer()
→ returns the function inner

f = outer()
f()
→ this calls the inner function returned by outer

→ using closures, we can do this:

def hello():
    return 'Hello!'

def outer(fn):
    def inner():
        print(f'Calling {fn}…')
        result = fn()        # fn is a free variable, from outer scope
        return result
    return inner

f = outer(hello)
→ inner function is created
→ it is a closure with fn pointing to hello

f() → calls inner, with fn pointing to hello
    → this calls hello()
    → and returns the result of that call
→ so outer can create and return a function that will execute whatever function is passed as an argument to outer

def outer(fn):
    def inner():
        print(f'Calling {fn}…')
        result = fn()
        return result
    return inner

f = outer(fun1)    f() → will call fun1 (and maybe do some extra things)
f = outer(fun2)    f() → will call fun2 (and maybe do some extra things)
f = outer(fun3)    f() → will call fun3 (and maybe do some extra things)

→ but we also want to be able to pass arguments to fun1, fun2, fun3
→ so pass them to inner and use those args to call fn

def outer(fn):
    def inner(*args, **kwargs):
        print(f'Calling {fn}…')
        result = fn(*args, **kwargs)
        return result
    return inner

f = outer(add)
→ f is now the inner function closure with fn → add

→ notice that inner can receive any number of positional and keyword-only args
→ whatever we pass in as arguments (when we call f())
→ will be passed as-is to whatever function fn points to (add in this case)

f(10, 20) → calls inner(10, 20), where fn → add
          → calls add(10, 20)


def outer(fn):
    def inner(*args, **kwargs):
        print(f'Calling {fn}…')
        result = fn(*args, **kwargs)
        return result
    return inner

à can think of this as a wrapper for fn
à we call outer(fn) to create a new function that wraps fn
à we can call that new function with arguments
à it does its own thing (like print in this example)
à but it also executes fn, with whatever arguments we pass in
à and returns the result of that call



def log(fn):
    def inner(*args, **kwargs):
        print(f'Calling {fn}…')
        result = fn(*args, **kwargs)
        return result
    return inner

à we actually have a very simple logger here
à suppose we have some functions

def add(x, y):
    return x + y

def greet(name):
    return f'Hello {name}!'

à we can create new functions that will perform the same task and also run the logging code

add_logged = log(add)
greet_logged = log(greet)

à so now, instead of calling add
à call add_logged
à if we need to change logging format
à do it in just one place!
à but we must change all calls to add to now be add_logged
à yikes!

à remember that Python is a dynamic language
à we can re-assign any object to any symbol
à instead of: add_logged = log(add)
à how about: add = log(add)
à the symbol add now points to the new function (not the original add)
à which will call the original function object

à that's the basic decorator pattern

def wrapper(func):
    def inner(*args, **kwargs):
        # some code here
        result = func(*args, **kwargs)
        # some code here
        return result
    return inner

def func(a, b):
    …

func = wrapper(func)

à wrapper is called a decorator



à this is so common

def func(a, b):
    …

func = wrapper(func)

à there is a shorthand notation!

@wrapper
def func(a, b):
    …

à exactly the same as the code above

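As a minimal sketch of the pattern, the log decorator from the earlier slides can be applied with the @ syntax. The names log, inner and add come from the slides; printing fn.__name__ instead of fn is an assumption made here only to keep the output short.

```python
def log(fn):
    # inner is a closure: fn is a free variable captured from log's scope
    def inner(*args, **kwargs):
        print(f'Calling {fn.__name__}…')
        result = fn(*args, **kwargs)
        return result
    return inner


@log  # equivalent to writing add = log(add) after the definition
def add(x, y):
    return x + y


total = add(10, 20)  # prints the log line, then calls the original add
print(total)
```

After decoration the symbol add points to inner, so add.__name__ reports 'inner' rather than 'add'.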


Coding

LRU Caching

à this is a really interesting application of decorators

à it solves the following problem

you have some function that gets called often
à the same set of arguments are used often
à the function is deterministic
à calls with the same arguments should produce the same result
à re-calculating the function is fairly costly

à we could use a caching mechanism
à first time a set of arguments is encountered, calculate result
à store result in a cache
à subsequent calls with same arguments recover the result from the cache

à basic idea is this:

cache = {}

def func(a, b, c):
    key = (a, b, c)
    if key in cache:
        return cache[key]
    # calculations here
    cache[key] = result  # add result to cache
    return result

à first time we call func with func(1, 2, 3)
à result is calculated
à result is inserted into cache dictionary using the key (1, 2, 3)
à next time func(1, 2, 3) is called, result is returned directly from cache dictionary

à I will show you in code how we could try to do this ourselves using decorators

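One possible way to turn the basic idea into a decorator is sketched below. The names cached, slow_add and calls are hypothetical, and the sketch only handles positional arguments and never evicts anything, so it is not a replacement for a real cache.

```python
def cached(fn):
    cache = {}  # one cache per decorated function, kept alive by the closure

    def inner(*args):
        # the positional arguments form the key, so they must be hashable
        if args not in cache:
            cache[args] = fn(*args)  # first time: calculate and store
        return cache[args]           # later calls: read from the cache

    return inner


calls = []  # records every time the real calculation runs


@cached
def slow_add(a, b, c):
    calls.append((a, b, c))
    return a + b + c


first = slow_add(1, 2, 3)   # calculated
second = slow_add(1, 2, 3)  # returned from the cache, slow_add body not run
print(first, second, calls)
```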
à Python has such a decorator: the lru_cache decorator

LRU à Least Recently Used

à caches should not grow indefinitely
à so keep the n most recent
à works well when most recent calls are good predictors of upcoming calls

à can specify the cache size we want
à maxsize argument
à None means unbounded
à otherwise specify an int to set max cache size

from functools import lru_cache

@lru_cache(maxsize=20)
def my_func(a, b):
    …

à uses a decorator
à this decorator can also take arguments

à there is a restriction
à the arguments passed to the function must be hashable values
à that's because they are used as a key in the cache dictionary

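A small sketch of lru_cache in use follows. The function fib is a hypothetical example chosen here because it recalculates the same values many times; cache_info() is part of the functools API and reports how the cache is being used.

```python
from functools import lru_cache


@lru_cache(maxsize=20)
def fib(n):
    # naive recursion; the cache makes the repeated sub-calls cheap
    return n if n < 2 else fib(n - 1) + fib(n - 2)


value = fib(30)
info = fib.cache_info()  # named tuple: hits, misses, maxsize, currsize
print(value)
print(info)
```

Calling fib with an unhashable argument such as a list raises a TypeError, which is the restriction mentioned above.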
Coding

Text Files

18

à opening text files
à read data from them
à write data to them

à remembering to close the file when we're done!
à a mechanism to make sure we never forget
à context managers

What are contexts and context managers?

à not going to cover how to create our own
à but will use some

à a context is an area of code that is entered and exited
à it is entered by "calling" a context manager using a with statement
à it is exited when the with code block is exited

the context manager is responsible for
à running some code on entry
à running some code on exit

starts a context using this context manager:

with open_database_connection(…) as conn:
    # work with open conn

# once the with block is exited,
# the connection is automatically closed

context manager's job is to open a connection upon entry
and to close the connection upon exit

Reading Text Files

Opening Files

à to read or write a text file, we first need to open the file

open(file_path)

path to file you want to open
can be absolute, or relative to where the Python app is running

à need to tell Python how we want to interact with the file, the mode of operation

open(file_path, mode)

à r: read-only (default)
à w: write-only, create new file, or overwrite if it exists
à a: write-only, create new file, or append if it exists

à what is returned by open()?

à an object that has many methods and properties
    à readlines()
    à closed à is file closed?
    à close() à this allows us to close the file after we're done with it

à but it is also an iterator
    à provides iteration over the individual lines in the text file
    à next
    à for loop etc…

à technically we can reset the "play head", but beyond scope of this course
à just think of it as an iterator



Closing Files

à always close a file after you're done with it
à releases the resource (not unlimited number of open files)
à writes are often buffered until the file is closed

f = open('file path', 'w')
# write to file
f.close()

Closing Files

à but what if an exception occurs while the file is open?
à use a try…finally… to always close the file, no matter what

f = open('file path', 'w')
try:
    # write to file
finally:
    f.close()

à that's one approach

open() as a Context Manager

à open() is also a context manager

with open('file_path', 'w') as f:
    # write to file

à as soon as the context exits, file is closed
à even if an unhandled exception occurs in context block

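The sketch below shows the context manager approach end to end. The file name example.txt and the temporary directory are assumptions made so the example can run anywhere; the file is first created so that there is something to read.

```python
import os
import tempfile

# hypothetical file inside a throwaway directory
path = os.path.join(tempfile.mkdtemp(), 'example.txt')

with open(path, 'w') as f:
    f.write('line 1\nline 2\n')

with open(path) as f:  # mode defaults to 'r'
    # the file object is an iterator over the lines of the file
    lines = [line.rstrip('\n') for line in f]

print(lines)
print(f.closed)  # the file was closed when the with block exited
```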
Coding

Writing Text Files

à same principle as reading text files
à open file (in write mode)
à write to file
à close file
à especially important for writes, since changes may be lost otherwise

à best is to use a context manager

Write modes

à w    open('<file_path>', 'w')
    à creates file if it does not exist
    à overwrites (clears out) file if it already exists

à a    open('<file_path>', 'a')
    à creates file if it does not exist
    à appends to end of file if it already exists

Writing Text To File

à f.write(<some string>)
    à writes specified string to file
    à it does not add a \n character automatically
    à have to do that ourselves if we need it

à f.writelines(<iterable of strings>)
    à writes each string in iterable to file
    à it does not add a \n character automatically after each string
    à have to do that ourselves if we need it

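A short sketch of write, writelines and the two write modes. The file name notes.txt and the temporary directory are assumptions used only to make the example self-contained.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')  # hypothetical file

with open(path, 'w') as f:
    f.write('first\n')                   # newline added by hand
    f.writelines(['second\n', 'third'])  # nothing is added after 'third'

with open(path, 'a') as f:               # append mode keeps existing content
    f.write('\nfourth\n')

with open(path) as f:
    content = f.read()

print(repr(content))
```

Opening the same path again with mode 'w' would clear the file before writing.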
Coding

Modules and Imports

19

What happens when amount of code becomes very large?
(e.g. lots of functions)

à sometimes our code needs to grow beyond a single file or a Jupyter notebook
à need to break up code into multiple files
à each file can group similar or related functionality together
à code in one file (like a function) should be available to the other files

à in Python these code files are called modules
à modules can be nested within other modules
à modules that contain other modules are called packages
à creating packages is beyond scope of this course
à but we should know what they are and how to use existing ones

Built-Ins

à Python has many built-in object types and functions
à bool, int, float, str, list, tuple, dict
à print(), filter(), sorted(), zip(), len(), etc

à these are baked right into Python
à they're always available
à we don't have to do anything special to use them

https://docs.python.org/3/library/functions.html

Standard Library

à Python has a lot of libraries (modules and packages) that come standard with base Python installation
à we have to specifically tell Python we want to use them
à we "load" them using an import statement

why not just load (import) everything always?
à there is a ton of libraries
à do you really want to load up thousands of libraries into memory for things you don't even need?
à other reasons too, which we'll see as we work with packages during the remainder of this course



Standard Library

Python provides a huge selection of libraries (modules and packages) that cover things like:

à numerical and math
    à math functions
    à stats functions
    à random numbers
    à Decimal objects (alternative to floats)
à date and time
à CSV files
à cryptography
à networking, internet and many, many more…

https://docs.python.org/3/library/index.html



3rd Party Libraries

à sometimes the standard library is insufficient or too cumbersome
à standard library has to be as generic as possible
à it may provide the basic building blocks to do something
à but you may need to write a lot of functions/code to tie them together

à many developers create libraries that leverage the standard library (or other 3rd party libraries), but provide a higher level, easier to use interface to more specialized functionality
à sometimes 3rd party libraries have a very narrow and specific focus
à performance or advanced functionality might be one reason

à NumPy    à SciPy    à QuantLib
à Pandas    à Matplotlib    à scikit-learn

3rd Party Libraries

à we can install those libraries, and import them like any package

à where do you find a list of available 3rd party libraries?
à most exhaustive source is PyPI (Python Package Index)
    https://pypi.org/
à they can be installed using pip install that we saw in the beginning
à more than 220,000 libraries!

3rd Party Libraries

à how to find the "good" ones?
à read blogs, posts, books, web sites and see what other people are using
à does it have good documentation?
à is it still actively developed?
à is it widely used?

à but maybe your need is extremely specific and very niche
à maybe something not so great is there that will work as a starting point

But don't just look for a 3rd party library for everything you write!
à if 3rd party library is full of bugs and unsupported, life will be painful!
à write your own code: often it is far simpler!

à this course cannot cover all these specialized libraries
à we will look at some
à by the end of this course you will have a solid foundation to easily understand and use these specialized libraries

à official Python docs and library docs are your best source of information
à blog posts and similar online resources can be very helpful (unless they're just plain wrong!)
à stackoverflow is a fantastic resource for getting questions answered
    https://stackoverflow.com/
à but at some point you will need to look into the official docs
à start now!



Basic Imports

à Python has quite a few built-in functions and data types (classes)
    https://docs.python.org/3/library/functions.html
à built-ins are always available
à they are essentially "pre-loaded"

à but there's a lot more in Python's standard library
à way too much to load everything all the time
à even more so with third party libraries
à so we need to load those as needed

à all this functionality is split up into separate modules or packages
à think of a module as a code file
à we then load just the modules we need

Loading a Module

à modules are just objects
à we need to "create" the object
à we need to assign a symbol (variable) to that object
à we can then use the variable to reference the module object
à which has properties, functions, other objects

the import statement is used to both
à load the module (create the module object)
à assign a symbol to the object

Example

import math

math is a module in the standard library for math related functionality

à the math module has been loaded (from file)
à and the variable (symbol) math is a reference to that module object

à math contains many functions, such as sqrt
à like with any object, we use dot notation to reach inside the object

math.sqrt(2) à 1.414…

à math: this symbol points to the math module object
à .sqrt: access the sqrt function inside that object using .

Aliasing

import some_module
à loads some_module
à creates a variable of the same name that references that module

what if we want to name that symbol something else?

import some_module as sm
à loads some_module
à creates a variable sm that references the some_module object

Coding

Import Variants

so far we have seen two variants of the import statement

import some_module
import some_module as alias

à if we want to use something inside the module, we have to use dot notation

fractions.Fraction(1, 2)
fractions.Fraction(1, 4)

à what if we just want to use Fraction inside fractions
à can we avoid using fractions.Fraction all the time?
à yes!

à we can import symbols from a module directly into the corresponding symbols in our code

from fractions import Fraction

à the fractions module is loaded
à but the symbol fractions is NOT added to our local variables
à instead the Fraction symbol is added to our local variables
à references the Fraction class inside the fractions module

f1 = Fraction(1, 2)

à can do the same with any module

from math import sqrt
sqrt(2)

à what if we want more than one attribute from the module?

from math import sqrt, pi, factorial

à sqrt, pi, and factorial are now available as symbols in our local scope

sqrt(2)
2 * pi
factorial(5)

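The import variants can be compared side by side in a short sketch. The alias m and the variable names are assumptions chosen for illustration; fractions is also imported in full here only so the two references to Fraction can be compared.

```python
import math
import math as m                # reuses the already loaded module object
import fractions
from fractions import Fraction  # adds only the symbol Fraction
from math import sqrt, pi, factorial

same_module = m is math                      # two names, one module object
same_class = Fraction is fractions.Fraction  # two names, one class

f1 = Fraction(1, 2) + Fraction(1, 4)
root = sqrt(2)
circumference = 2 * pi
fact = factorial(5)
print(same_module, same_class, f1, root, circumference, fact)
```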
Coding

Dates and Times

20

Fundamental Concepts

à time zones and UTC
à epoch times
à times without dates
à dates without times
à dates with times
à ISO 8601 Standard

Coordinated Universal Time

à UTC
à sometimes still referred to as GMT (Greenwich Mean Time)
à world standard
à no adjustments for daylight saving time

à easiest is to always use UTC internally in our programs
à convert incoming times to UTC
à work exclusively in UTC internally
à display to user using their preferred time zone

incoming datetime data (string)
à program: parser à UTC converter
à d = <datetime object in UTC>

Challenges with external sources of time data

à Python has special data types, for time, date and datetime
à external sources of time data usually given as strings
à it is a visual (string) representation of a date/time
à but what format?

3/1/2020 2:35:01 pm à is this March 1, or January 3?
March 1, 2020 14:35:01
03-01-2020 02:35:01 PM

à what time zones? à may or may not be specified, using different "standards"
à how do we convert these to UTC based date/times in our apps?
à what formatting should we use?



ISO Format

à ISO 8601 defines standards for string representations of dates and times

YYYY-MM-DD T hh:mm:ss[.nnnn] ±hh:mm

date: YYYY-MM-DD
    à 2-digit months: 01, 02, 12
    à 2-digit days: 01, 02, 28, 29
separator: T
time: hh:mm:ss[.nnnn]
    à 24-hour clock
    à optional fractional seconds
time zone (offset from UTC): ±hh:mm
    à optional!
    à offset, in hours and minutes (positive or negative) from UTC
    à sometimes : is omitted
    à Z is often used instead of 00:00
    à so Z means UTC time zone



May 1, 2020, 10:23:35am in Eastern Time
à daylight savings time in effect (EDT)    2020-05-01T10:23:35-04:00

December 1, 2020, 10:23:35am in Eastern Time
à daylight savings time NOT in effect (EST)    2020-12-01T10:23:35-05:00

à keeping track of all this in calculations is difficult!
à convert to UTC first

2020-05-01T10:23:35-04:00 à 2020-05-01T14:23:35Z
2020-12-01T10:23:35-05:00 à 2020-12-01T15:23:35Z

à then convert to whatever time zone for display purposes to user (if necessary)

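The two conversions can be checked with the datetime module, which is covered later in this section. This is a sketch assuming Python 3.7 or higher, where datetime.fromisoformat is available; the variable names are chosen here for illustration.

```python
from datetime import datetime, timezone

dt_edt = datetime.fromisoformat('2020-05-01T10:23:35-04:00')
dt_est = datetime.fromisoformat('2020-12-01T10:23:35-05:00')

# convert both aware datetimes to UTC
utc_1 = dt_edt.astimezone(timezone.utc)
utc_2 = dt_est.astimezone(timezone.utc)

print(utc_1.isoformat())
print(utc_2.isoformat())
```

Python writes the UTC offset as +00:00 rather than Z; both forms denote UTC.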
à Python has a lot of functionality for calculations with dates and times
à to minimize introducing bugs, always use UTC based times
à but converting these "input" times to UTC is difficult!
à can be done using Python and the standard library
à much easier to leverage 3rd party libraries for this

easier way to deal with:
à dateutil - parsing string
à pytz - time zones

à we'll look at these later in this course

Epoch Time

à we saw that dealing with date/times involves time zones (whether UTC or something else)
à introduced by Unix as a way to define a datetime without using timezones
à start with a base datetime
à the epoch
à given a datetime, calculate it as the difference in seconds from the epoch

à also called Unix or POSIX time
à epoch is system dependent
à Usually: January 1, 1970 00:00:00 UTC

2020-05-01T10:23:35-04:00 à 1588343015.0

à but if ingesting datetime information that use epoch times, you need to know the epoch!

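The epoch value above can be reproduced with a short sketch, assuming Python 3.7 or higher and the usual 1970 epoch; the variable names are chosen here for illustration.

```python
from datetime import datetime

dt = datetime.fromisoformat('2020-05-01T10:23:35-04:00')

# for an aware datetime, timestamp() returns seconds since 1970-01-01T00:00:00 UTC
epoch_seconds = dt.timestamp()
print(epoch_seconds)
```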


The time Module

à used for time manipulations
à mostly uses epoch times
à we won't use this much

à but it also includes some useful functions
à sleep
à perf_counter

The datetime Module

à used for date (only), time (only) and datetime (date with time) objects
à can handle time zones
à provides formatting and parsing capabilities
à defines a timedelta data type (class)
à used to represent time difference between two date/time objects

The time Module

perf_counter

à perf_counter is used to measure elapsed time in (float) seconds
à from some undefined start (0) (usually when program starts running)
à always look at difference between calls to perf_counter
à uses a clock with highest available precision

from time import perf_counter

t1 = perf_counter()
# code to time goes here
t2 = perf_counter()
elapsed = t2 - t1

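A runnable version of the timing pattern is sketched below. The summation is only a stand-in for whatever code is being measured.

```python
from time import perf_counter

t1 = perf_counter()
total = sum(range(1_000_000))  # stand-in for the code being timed
t2 = perf_counter()

elapsed = t2 - t1  # only the difference between the two readings is meaningful
print(f'elapsed: {elapsed:.6f}s')
```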
sleep

à sleep(n) is used to pause execution for (float) n seconds

à why would you want to slow your program down??
à give time for something else to finish
à usually some external resource
à maybe a network connection is temporarily down
à retry connecting a few times, but wait in-between retries

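The retry idea could be sketched as follows. The helper retry, the function flaky_connect and the state dictionary are all hypothetical; flaky_connect simply simulates a connection that fails twice before succeeding.

```python
from time import sleep


def retry(fn, attempts=3, delay=0.01):
    """Call fn until it succeeds, sleeping between failed attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts:
                raise     # out of retries: let the error propagate
            sleep(delay)  # wait before trying again


state = {'calls': 0}


def flaky_connect():
    # stand-in for a network call: fails twice, then succeeds
    state['calls'] += 1
    if state['calls'] < 3:
        raise ConnectionError('temporarily down')
    return 'connected'


result = retry(flaky_connect)
print(result, state['calls'])
```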
Getting the epoch

à Unix systems use January 1, 1970, 00:00:00 (UTC)

time.gmtime(n)
à returns a time object (struct_time)
à based on n seconds elapsed from epoch
à has the following properties:
    tm_year, tm_mon, tm_mday
    tm_hour, tm_min, tm_sec + a few more…
à ignores fractional seconds (if float)

à to find the epoch on your system

time.gmtime(0)
à struct_time(tm_year=1970, tm_mon=1, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, …)

Getting the current epoch time

time.time() à returns the current time (in seconds) since the epoch

à get UTC struct_time from that

time.gmtime(time.time())

Converting from struct_time to epoch time

à gmtime(n) converts an epoch time n to a struct_time
à can also convert a struct_time back to an epoch time

calendar.timegm(struct_time)

à timegm is the inverse of gmtime
à it is located in the calendar module

from calendar import timegm
from time import gmtime

n = 1_000_000_000
t = gmtime(n) à struct_time(tm_year=2001, tm_mon=9, …)
timegm(t) à 1_000_000_000

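The round trip between epoch times and struct_time objects can be verified with a few lines. The variable names are chosen here for illustration, and the first result assumes a system whose epoch is the usual January 1, 1970.

```python
from calendar import timegm
from time import gmtime

epoch = gmtime(0)           # the epoch itself, as a struct_time in UTC
t = gmtime(1_000_000_000)   # one billion seconds after the epoch
round_trip = timegm(t)      # back to seconds since the epoch

print(epoch.tm_year, epoch.tm_mon, epoch.tm_mday)
print(t.tm_year, t.tm_mon, t.tm_mday)
print(round_trip)
```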


Formatting epoch time to human readable string

à if we show someone an epoch time (a float), that does not mean much to them
à as humans we are used to certain formats for the date and time

à use the strftime(format, struct_time) function (string format time)
à format is a string that contains special formatting directives

à for example, suppose we have an epoch time: 1587253022
  (which is actually 2020-04-18 23:37:02)
à we can format this time into April 18, 2020 as follows:

t_struct = gmtime(1587253022)
strftime("%B %d, %Y", t_struct)
à "April 18, 2020"

here are a few more format directives:

https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes

%Y à four digit year
%y à two digit year
%m à month number
%B à month full name
%b à month abbreviated name
%d à day of the month number
%H à hour in 24-hour clock
%I à hour in 12-hour clock
%M à minute number
%S à second number
%p à AM or PM
%z à time zone offset ±HHMM
%Z à time zone name
%w à weekday number (Sunday=0)
%A à weekday full name
%a à weekday abbreviated name



à epoch time t = 1587253022 (2020-04-18T23:37:02)

from time import strftime, gmtime
t_struct = gmtime(t)

strftime("%Y-%m-%dT%H:%M:%SZ", t_struct)
à '2020-04-18T23:37:02Z'

strftime("Today is %A, %B %d, %Y", t_struct)
à 'Today is Saturday, April 18, 2020'

strftime('Time: %I:%M %p %Z', t_struct)
à 'Time: 11:37 PM UTC'

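The formatting calls above can be run as a self-contained sketch. The variable names are assumptions, and the month and weekday names assume the default English ("C") locale; %Z is left out because its output can vary by platform.

```python
from time import gmtime, strftime

t_struct = gmtime(1587253022)

iso = strftime('%Y-%m-%dT%H:%M:%SZ', t_struct)   # literal Z marks UTC
long_form = strftime('%A, %B %d, %Y', t_struct)
clock = strftime('%I:%M %p', t_struct)           # 12-hour clock with AM/PM

print(iso)
print(long_form)
print(clock)
```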


Parsing Date/Time Strings

à this is the reverse of the formatting we just saw

given a string such as: "04/18/2020 11:37:02 PM"
à "convert" it to an epoch time
à we'll assume the time was given in UTC (since no indication was given)
à also assume format is Month/Day/Year (not Day/Month/Year)
à in this case we can safely assume this, since there is no month 18
à not always that lucky!

à we need to tell Python what to expect in the string, using same directives as before

time.strptime(date_string, format)    (string parse time)

from time import strptime

s = "04/18/2020 11:37:02 PM"
strptime(s, "%m/%d/%Y %I:%M:%S %p")

à time.struct_time(
      tm_year=2020,
      tm_mon=4,
      tm_mday=18,
      tm_hour=23,
      tm_min=37,
      tm_sec=2,
      tm_wday=5,
      tm_yday=109,
      tm_isdst=-1
  )

à for every date/time formatting variant, we have to specify the format to parse it

4/18/20 23:45:34          "%m/%d/%y %H:%M:%S"
18/04/2020 11:45:34 PM    "%d/%m/%Y %I:%M:%S %p"
20/4/18 11:45:34 PM       "%y/%m/%d %I:%M:%S %p"

à this can get difficult
à especially if our various data sources use a mixture of formats
à this is where 3rd party libraries, such as dateutil can help
à we'll come back to that later…

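Parsing and the conversion back to an epoch time can be combined in a short sketch. As on the earlier slide, the string is assumed to be in UTC; timegm from the calendar module interprets the struct as UTC.

```python
from calendar import timegm
from time import strptime

s = '04/18/2020 11:37:02 PM'
t_struct = strptime(s, '%m/%d/%Y %I:%M:%S %p')

# timegm treats the struct_time as UTC and returns seconds since the epoch
epoch_seconds = timegm(t_struct)
print(t_struct.tm_hour, epoch_seconds)
```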
Coding

The datetime Module

à the time module is a low-level library
à good for working with epoch times
à but a bit cumbersome
à not a ton of functionality

à instead use the datetime module
à isolates us from epoch times (used internally)
à provides handy data types (classes)
    à date
    à time
    à datetime
    à timedelta
    à timezone



datetime.date

à date is a data type (class) for working with pure dates (no times)

from datetime import date
date(year, month, day)

(or import datetime module, and use fully qualified names)

import datetime
datetime.date(year, month, day)

à properties:
    .year
    .month
    .day

datetime.date

date.today()
à returns local date as a date object

<date_obj>.isoformat()
à returns an ISO 8601 string for the date object

date.fromisoformat("iso formatted date string")
à parses and creates a date object from an ISO formatted date string

datetime.time

à time is a data type (class) used to work with pure times (no date)
à it can be time zone naïve, or aware

à time(hour, minute, second, microsecond, tzinfo)

à properties: hour, minute, second, microsecond
    tzinfo à None for naïve times

à time.fromisoformat(s)
à <time_obj>.isoformat()

datetime.datetime

à class that supports both date and time

datetime(year, month, day,
         hour, minute, second, microsecond,
         tzinfo)

à properties for year, month, …
à datetime.datetime.fromisoformat(s)
à <datetime_obj>.isoformat()

à datetime.datetime.utcnow()
à returns the current UTC date/time as a naïve datetime
  (deprecated since Python 3.12 in favor of datetime.now(timezone.utc))

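The three classes and their ISO helpers can be tried out in a short sketch, assuming Python 3.7 or higher for fromisoformat. The variable names are chosen here for illustration; datetime.combine is a standard class method that joins a date and a time.

```python
from datetime import date, time, datetime

d = date(2020, 4, 18)
t = time(23, 37, 2)          # naive: tzinfo is None
dt = datetime.combine(d, t)  # naive datetime built from the two parts

d_iso = d.isoformat()
t_iso = t.isoformat()
dt_iso = dt.isoformat()
parsed = date.fromisoformat(d_iso)  # back to a date object

print(d_iso, t_iso, dt_iso, parsed == d)
```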
Coding

Date Arithmetic

à date arithmetic mostly involves working with
    à dates, times, datetimes
    à time durations (e.g. 1 day and 2 hours and 30 minutes and 15 seconds)

datetime module has a special class for durations
à timedelta

à subtracting one date/time from another results in a timedelta
à can add or subtract a timedelta from a date/time

datetime.timedelta

timedelta(days,
          seconds, microseconds, milliseconds,
          minutes,
          hours,
          weeks)

à arguments are optional and default to 0
à argument values are additive

timedelta(days=1, hours=1) à duration of 1 day and 1 hour
                           à 25 hours

datetime.timedelta

timedelta(days,
          seconds, microseconds, milliseconds,
          minutes,
          hours,
          weeks)

à notice there is no month argument
à what does it mean to add a month to a date???
à 31 days, 30 days, 29 days, 28 days???

1/15/2020 + 1 month à 2/15/2020?
1/31/2020 + 1 month à 2/31/2020???

datetime.timedelta

à most arguments in timedelta() are for convenience
à internally timedelta objects store the values in days, seconds, and microseconds
à properties .days .seconds .microseconds

<timedelta_obj>.total_seconds()
à returns the total number of seconds (fractional float) in duration

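A brief sketch of timedelta arithmetic, using the 1 day and 1 hour duration from the earlier slide. The start date and variable names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

delta = timedelta(days=1, hours=1)    # arguments are additive: 25 hours
stored = (delta.days, delta.seconds)  # internal storage: days and seconds
total = delta.total_seconds()

start = datetime(2020, 1, 31, 12, 0, 0)
end = start + timedelta(weeks=1)      # adding a duration to a datetime
difference = end - start              # subtracting datetimes gives a timedelta

print(stored, total, end, difference)
```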
Coding

Naïve and Aware Times

à aware time à time has a time zone attached to it
à naïve time à no time zone info

to simplify our coding life, we made two decisions:
à all times we create/work with will be
    à naïve
    à in UTC

à the idea is that any datetime we ingest, immediately gets transformed into a naïve UTC datetime
à convert to aware non-UTC for display or output purposes only

timezone Definition

datetime.timezone
à class to define a time zone
à name (optional)
à UTC offset à defined as a timedelta object

how is that offset defined exactly?
à time zone offset defines the number of hours and minutes that should be added to or subtracted from the corresponding UTC time

if a time zone is 4 hours "behind" UTC, then the offset is -4 hours

tz_EDT = timezone(timedelta(hours=-4), 'EDT')

à pre-defined UTC timezone: timezone.utc

Aware datetimes

from datetime import datetime, timezone, timedelta

d1 = datetime.fromisoformat('2020-05-15T13:30:00-05:00')

tz_EDT = timezone(timedelta(hours=-4), 'EDT')

d2 = datetime(year=2020, month=5, day=13,
              hour=13, minute=30, second=0,
              tzinfo=tz_EDT)

Converting from one Time Zone to Another

if we have an aware datetime, we can easily change it to another time zone
à use the .astimezone(target_tz) method of the datetime object

d1 = datetime.fromisoformat('2020-05-15T13:30:00-04:00')

tz_CDT = timezone(timedelta(hours=-5), 'CDT')

d1.astimezone(tz_CDT)
à datetime(2020, 5, 15, 12, 30, 0, tzinfo=tz_CDT)

d1.astimezone(timezone.utc)
à datetime(2020, 5, 15, 17, 30, 0, tzinfo=timezone.utc)

à notice that the datetime objects remain aware

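The conversion can be run as a self-contained sketch, assuming Python 3.7 or higher for fromisoformat. The names d_cdt and d_utc are chosen here for illustration.

```python
from datetime import datetime, timezone, timedelta

d1 = datetime.fromisoformat('2020-05-15T13:30:00-04:00')
tz_CDT = timezone(timedelta(hours=-5), 'CDT')

d_cdt = d1.astimezone(tz_CDT)        # same instant, shown one hour earlier
d_utc = d1.astimezone(timezone.utc)  # same instant, shown in UTC

print(d_cdt.isoformat())
print(d_utc.isoformat())
print(d_cdt == d1 == d_utc)  # aware datetimes compare by instant
```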


Adding or Removing Time Zone

à careful! Do not remove time zone from a non-UTC timestamp!
à unless you know what you are doing and this is intentional
à ok to remove from a UTC aware timestamp, since we assume everything is UTC

à to make a UTC aware timestamp naïve, just replace the tzinfo value with None
à to add a time zone to a naïve timestamp, replace tzinfo with appropriate timezone
à use the .replace() method on datetime objects

The replace() method

if dt is some datetime object

→ create a new datetime object with the exact same values:

dt_copy = dt.replace()

→ or replace one or more values while we do the copy

dt_copy = dt.replace(year=2021, hour=0)

→ in particular, we can do that with the tzinfo value

dt.replace(tzinfo=None)

dt.replace(tzinfo=timezone.utc)

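One way to combine astimezone() and replace() for the "ingest as naïve UTC" idea is sketched below. The helper name to_naive_utc is hypothetical, not something defined in the course.

```python
from datetime import datetime, timezone


def to_naive_utc(dt):
    """Hypothetical helper: convert an aware datetime to a naive UTC datetime."""
    if dt.tzinfo is None:
        raise ValueError('expected an aware datetime')
    # Convert to UTC first, and only then drop the time zone.
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


aware = datetime.fromisoformat('2020-05-15T13:30:00-04:00')
naive_utc = to_naive_utc(aware)                        # no tzinfo, UTC wall time
aware_again = naive_utc.replace(tzinfo=timezone.utc)   # re-attach UTC
```

Calling replace(tzinfo=None) directly on the non-UTC value would silently keep 13:30, which is exactly the mistake the warning is about.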
Daylight Savings Time

→ many places change their clock twice a year – daylight savings time
    → not everyone does
        → most parts of Arizona do not, but some do!
    → not everyone does it at the same time
    → not everyone changes by the same amount
    → when and how much has changed over the years for the same places

so, how do we convert a UTC datetime into some specific time zone?
→ it must take all these things into account → difficult!

→ Olson Database (or IANA time zone database)
    https://en.wikipedia.org/wiki/Tz_database
→ the pytz 3rd party library → covered later

Coding

Custom Representations

→ recall the time module

→ strftime()
    → format a time struct using formatting directives

    strftime('Time: %I:%M %p %Z', t_struct)

→ strptime()
    → parse a datetime string into a time struct using formatting directives

    strptime(s, "%m/%d/%Y %I:%M:%S %p")

strftime is available for:
→ datetime.time
→ datetime.date
→ datetime.datetime

strptime is available for:
→ datetime.datetime

→ uses the same special formatting directives

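A short sketch of both directions with the datetime module. The sample string and format strings are made up for illustration, and %p assumes an English (or C) locale.

```python
from datetime import datetime

# Parse a string into a datetime using formatting directives.
dt = datetime.strptime('05/15/2020 01:30:00 PM', '%m/%d/%Y %I:%M:%S %p')

# Format datetime, date and time objects using the same directives.
formatted = dt.strftime('%Y-%m-%d %H:%M')
date_only = dt.date().strftime('%d/%m/%Y')
time_only = dt.time().strftime('%H:%M:%S')
```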
Coding

The csv Module
21

→ earlier we saw how to read and write text files
→ we even parsed some data from a simple CSV file

→ but CSV formats vary, so more complicated than that simple example
    → often called CSV dialects
    → would require a lot of manual work on our part to deal with all these variants

→ csv module provides functionality to read and write a wide variety of CSV formats
    → including tab delimited, pipe (|) delimited
    → can deal with different line separators
        → Unix and Windows line separators are different
        → \n in Unix → \r\n in Windows

Reading CSV Files

What is CSV Data?

CSV is a format for tabular data
→ rows and columns

basic idea:
→ each row in a file is a row of data
→ rows in file are separated by a newline (OS specific)
→ each field in the row is separated by a separator aka delimiter

but that brings up a few things…
→ what field separator to use?
    comma? → yes, but not necessarily
→ how to deal with a field containing a comma (or whatever separator)?

FULL_NAME,DOB,SSN
Smith, John,3/1/1985,123-456-789

Smith, John → actually a single field, but the , inside is going to cause problems

→ use some delimiter
    → maybe double quotes
    → but doesn't have to be!

"Smith, John","3/1/1985","123-456-789"

→ but we don't need the delimiters around the DOB or SSN

"Smith, John",3/1/1985,123-456-789

→ what if field contains the field delimiter character?

"Doyle, Conan","First Holmes book was the "Scarlet Letter""

→ double up the quotes

"Doyle, Conan","First Holmes book was the ""Scarlet Letter"""

→ or use a prefix character to "escape" the next character
    → e.g. \ like Python (\n, \t, etc) → but doesn't have to be!

"Doyle, Conan","First Holmes book was the \"Scarlet Letter\""

→ as you can see there can be many different ways of approaching this

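Both quoting conventions can be parsed with the standard csv module, as this sketch suggests. The doublequote and escapechar parameters are real csv.reader options; the two input lines are the examples from the slide.

```python
import csv

# csv.reader accepts any iterable of lines, so a list of strings works here.
doubled = ['"Doyle, Conan","First Holmes book was the ""Scarlet Letter"""']
escaped = ['"Doyle, Conan","First Holmes book was the \\"Scarlet Letter\\""']

# Default behavior: a doubled quotechar inside a quoted field means one quote.
row_doubled = next(csv.reader(doubled))

# Alternative: a backslash escapes the next character.
row_escaped = next(csv.reader(escaped, doublequote=False, escapechar='\\'))
```

Either way the second field comes back with plain double quotes around Scarlet Letter.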
CSV is not a standard format

→ unfortunately CSV is not exactly a standard
    → a variety of flavors exist → dialects

→ most common one is Excel
    delimiter (field separator) → ,
    quotechar (field delimiter) → "
    doubles quotechar if found inside a field
    only uses quotechar if delimiter is found inside a field

→ but these are other valid CSV formats too

field1|field2|field3                → pipe (|) delimited
field1<tab>field2<tab>field3        → tab delimited (<tab> stands for the tab character)


Parsing CSV Data

→ default parser dialect is excel
→ but we can specify custom settings for delimiter, quotechar, etc

csv.reader(f, delimiter=',', quotechar='"')
    f          → an open file to read from
    delimiter  → optional – defaults to ,
    quotechar  → optional – defaults to "

→ returns an iterator of parsed rows over the file

import csv

with open('some_file', newline='') as f:
    reader = csv.reader(f)  # default uses , and "
    for row in reader:
        # row is a list containing parsed fields
        print(row)

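A small sketch of custom reader settings. Here io.StringIO stands in for an open file, and the pipe-delimited, single-quoted data is invented for the example.

```python
import csv
import io

# io.StringIO stands in for an open file; the data is made up for illustration.
f = io.StringIO("name|quote\n'Doyle, Conan'|'It''s elementary'\n")

# Override the excel defaults: | separates fields, ' quotes them.
reader = csv.reader(f, delimiter='|', quotechar="'")
rows = list(reader)   # each row is a list of parsed string fields
```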
Coding

Dialects

→ in the previous lecture we saw that we can define all kinds of settings to specify CSV format

→ works, but if we need to repeat the same settings often
    → tedious typing the same code over and over again
    → error prone – might forget or mis-type one of the settings

→ instead we can bundle up all the settings into a custom dialect
    → basically just a way to package the settings once in our program
    → and re-use elsewhere in the same program multiple times

Listing Available Dialects

→ csv module comes with some pre-defined dialects
    → excel    → excel-tab

→ we can add our own to that list
    → register a dialect with
        → a name for the dialect
        → values for delimiter, quotechar, etc

csv.register_dialect("<name>", delimiter=…, quotechar=…, …)

Using a defined Dialect

→ we can specify a dialect instead of individual values for csv.reader

csv.reader(f, dialect='excel')
    → excel is the default for dialect
    → same as csv.reader(f)

→ or we can specify our custom dialect we registered

csv.reader(f, dialect='my-custom-dialect')

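Registering and then using a dialect might look like the sketch below. The dialect name 'pipe-single-quote' and the sample data are arbitrary choices for this example.

```python
import csv
import io

# 'pipe-single-quote' is an arbitrary name chosen for this sketch.
csv.register_dialect('pipe-single-quote', delimiter='|', quotechar="'")

data = io.StringIO("a|'b|c'|d\n")
rows = list(csv.reader(data, dialect='pipe-single-quote'))

# The registered name now shows up next to the built-in dialects.
available = csv.list_dialects()
```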
Coding

More Examples

→ in this lecture we are going to process two more CSV files
    → one with some NASDAQ data (nicely formatted)
    → an older file from the US Census Bureau (oddly formatted)

Coding

Writing CSV Files

→ reverse of reading and parsing a CSV file

→ given some data, write it out to a CSV file
    → an iterable of rows
    → each row is itself an iterable of fields (columns)

→ just like reading a CSV file, we can specify formatting options
    → either using individual values (delimiter, quotechar, etc)
    → or using a dialect (built-in or custom)

→ unless there are some reasons not to, just use the standard excel dialect

Writing a CSV File

→ use csv.writer
→ then use the writerow method to write out each row in your data

with open('<file_name>', 'w', newline='') as f:
    writer = csv.writer(f, dialect='…')
    for row in data:
        writer.writerow(row)

→ where data is an iterable containing iterables of fields

data = [
    [row1_col1, row1_col2, …],
    [row2_col1, row2_col2, …],
]

→ if you want a header row, write that out too using writerow and an iterable of headers

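A sketch of writing a header row plus data with the default excel dialect. It writes to an in-memory buffer instead of a real file, and the rows are invented for illustration; writerows is the csv module's shortcut for calling writerow in a loop.

```python
import csv
import io

headers = ['name', 'quote']
data = [
    ['Doyle, Conan', 'the "Scarlet Letter"'],
    ['Smith, John', 'no quoting needed'],
]

buffer = io.StringIO()        # stands in for a file opened with newline=''
writer = csv.writer(buffer)   # default excel dialect
writer.writerow(headers)      # header row first
writer.writerows(data)        # then every data row
output = buffer.getvalue()
```

Fields containing the delimiter or the quote character are quoted automatically; the others are written as-is.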
Coding

The random Module
22

random module

→ random number generators (integers, floats)
    → pseudo random
        → actually generated by an algorithm
        → gives the appearance of random number generation (uniform distribution)

→ Mersenne Twister algorithm
    → PRNG (pseudo random number generator)
    → deterministic generator
        → for a given seed, we know ahead of time what the sequence will be
        → not suitable for security/cryptography for example
        → but suitable for most other purposes


→ doesn't a deterministic algorithm defeat the purpose of generating random numbers?
    → goal is to generate a sequence of numbers that
        → is uniformly distributed
        → appears random to user

→ but if the sequence is the same every time?
    → that's a good thing when testing code
        → testing and debugging random things is difficult
    → we can make it so the generated sequence is not the same every time the program runs
        → seed value → every different seed results in a different sequence
        → use a different seed every time the program starts
        → Python does that for us (uses an operating system randomness source if available, otherwise the system time)


→ beyond uniformly distributed PRN

→ random number generator using various distributions
  (normal, lognormal, triangular, beta, gamma, and more)

→ shuffle a sequence of elements

→ random sampling
    → without replacement → choose 5 cards from a deck of 52
    → with replacement → roll two dice (each of which can be 1-6)
                       → pick two elements with replacement from {1, 2, 3, 4, 5, 6}

Interval Notation

[a, b]    a <= x <= b
(a, b)    a < x < b
(a, b]    a < x <= b
[a, b)    a <= x < b

→ [ includes endpoint
→ ( excludes endpoint

Random Numbers

Random seed

→ seed is used as a "primer" for different random number sequences

→ Python automatically sets one based on system time (or an operating system randomness source when available)
    → so every time our program restarts we get different sequences of random numbers

→ we can override the seed value
    → useful to guarantee repeatability of "random" sequence
    → testing, debugging

random.seed()     → uses system time (or OS randomness source)
random.seed(a)    → uses value a (same as random.seed() if a is None)
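Repeatability with a fixed seed can be seen in a few lines. The seed value 42 is arbitrary.

```python
import random

random.seed(42)
first_run = [random.random() for _ in range(3)]

random.seed(42)   # same seed, so the same sequence starts over
second_run = [random.random() for _ in range(3)]

random.seed()     # back to an unpredictable seed
```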


The base PRNG

→ there is a single pseudo random number generator

random.random()
→ generates and returns the next PRN
    → float in [0.0, 1.0)
    → uniformly distributed
→ call it repeatedly to get the next number, and the next…

→ other random related functions → all use this one at their base
    → random integer generator
    → random numbers that will display certain distributions (e.g. normal)
    → shuffling, sampling
    → all use random()
    → all display same repeatability for same seed

Generating Random Integers

randrange(stop)                 → generates an integer number in range(stop)
randrange(start, stop, step)    → generates an integer number in range(start, stop, step)

randint(a, b)
→ generates random int in [a, b]
→ equivalent to randrange(a, b+1)
→ syntax convenience

→ uniform distribution
→ call repeatedly to produce a sequence of random integers

Generating Random Floats

→ random() → random float in [0.0, 1.0)
    → uniform distribution

→ uniform(a, b) → random float in [a, b]
    → uniform distribution

→ gauss(mu, sigma) → random float
    → normal distribution
    → mean = mu, std deviation = sigma

… and more - see the online docs
https://docs.python.org/3/library/random.html
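A brief sketch exercising the integer and float generators. The seed, ranges and sample sizes are arbitrary illustration values.

```python
import random

random.seed(0)   # fixed seed so the run is repeatable

ints = [random.randint(1, 6) for _ in range(1000)]           # integers in [1, 6]
evens = [random.randrange(0, 10, 2) for _ in range(1000)]    # 0, 2, 4, 6 or 8
floats = [random.uniform(5, 10) for _ in range(1000)]        # floats in [5, 10]
normals = [random.gauss(0, 1) for _ in range(10_000)]        # mu=0, sigma=1

# With this many draws the sample mean should land near mu.
sample_mean = sum(normals) / len(normals)
```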


Coding

Sampling and Shuffling

Shuffling

→ in-place shuffle of items in a mutable sequence

l = [1, 2, 3]

shuffle(l)

l → [3, 1, 2]    (for example)
→ l was mutated

Choosing a single random element

→ choice(seq)
    → chooses a single random element from seq
    → seq can be any sequence type (even immutable)
    → does not modify seq in any way

l = [1, 2, 3, 4, 5]

choice(l) → 3
choice(l) → 5        uniform distribution
choice(l) → 3


Choosing multiple random elements at a time

→ choices(seq, k=…)
    → choose k random elements from some sequence seq (uniform distribution)
    → with replacements
        → the same element may get picked more than once in each set of k elements
    → returns result as a list of k elements

l = 1, 2, 3, 4, 5, 6

choices(l, k=2) → [6, 5]
choices(l, k=2) → [1, 3]
choices(l, k=2) → [2, 2]

→ k can be larger than sequence length
  (guaranteed to have repeated elements!)

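The three functions seen so far (shuffle, choice, choices) can be tried together in a short sketch. The seed and data are arbitrary.

```python
import random

random.seed(1)

deck = list(range(10))
random.shuffle(deck)        # mutates deck in place and returns None

letters = ('a', 'b', 'c')   # an immutable sequence is fine for choice
one = random.choice(letters)

# Ten picks with replacement, so repeats are expected.
rolls = random.choices([1, 2, 3, 4, 5, 6], k=10)
```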
Sampling a Population

→ sample(population, k)
    → population can be a sequence, and even a range object (since Python 3.11 a set must first be converted to a list or tuple)
    → choose k random elements from some population (uniform distribution)
    → without replacements
        → the same element cannot be picked twice in each set of k elements
    → random sampling → k is the sample size
    → returns result as a list of k elements

→ k cannot exceed len(population) → ValueError otherwise

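Sampling without replacement, including the ValueError case, is sketched below. The deck of 52 integers and the seed are illustration values.

```python
import random

random.seed(2)
deck = list(range(52))
hand = random.sample(deck, k=5)   # 5 distinct positions from the deck

# Asking for more elements than the population holds raises ValueError.
try:
    random.sample(deck, k=53)
    too_many_ok = True
except ValueError:
    too_many_ok = False

# A range works as a population without building a list first.
lottery = random.sample(range(1, 1_000_000), k=3)
```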
Weighted Choices

l = [1, 2, 3, 4, 5, 6, 7, 8]

choices(l, k=3)        (k is a keyword-only argument)
→ a list of k random elements from l
→ with replacement
→ uniform distribution
    → for each pick of an element to include in the k choices
    → every element has the same probability of being picked

Weighted Choices

→ but we can change those probabilities
    → by specifying a sequence of weights to assign to each element of the sequence
    → if specified, len(weights) must equal len(sequence)

l = [1, 2, 3, 4, 5, 6, 7, 8]
weights = [1, 1, 1, 1, 2, 1, 1, 1]

choices(l, weights=weights, k=3)

→ at every pick of the k elements
    → 5 is twice as likely to be picked as any other element

→ weights can be floats too
→ no longer a uniform distribution
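The effect of the weights can be checked empirically. This sketch draws a large number of picks and compares how often 5 and 1 come up; the seed and sample size are arbitrary.

```python
import random
from collections import Counter

random.seed(3)
l = [1, 2, 3, 4, 5, 6, 7, 8]
weights = [1, 1, 1, 1, 2, 1, 1, 1]

picks = random.choices(l, weights=weights, k=90_000)
counts = Counter(picks)

# 5 carries twice the weight of 1, so this ratio should be close to 2.
ratio = counts[5] / counts[1]
```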


Coding

Math and Statistics Modules
23

→ already seen math module
    → look at a few more functions in that module

→ statistics module
    → variety of simple stats
    → means, variances
    → normal distributions

math Module

factorial(n)                  → factorial function
perm(n, k)                    → permutations
comb(n, k)                    → combinations
gcd(a, b)                     → greatest common divisor of integers a and b
fsum(iterable)                → floating point sum, more accurate than sum()
prod(iterable, *, start=1)    → product of all elements in iterable

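A quick sketch of these functions with small inputs. perm, comb and prod need Python 3.8 or later; the inputs are chosen only for illustration.

```python
import math

fact = math.factorial(5)             # 5 * 4 * 3 * 2 * 1
perms = math.perm(5, 2)              # ordered picks of 2 out of 5
combs = math.comb(5, 2)              # unordered picks of 2 out of 5
divisor = math.gcd(12, 18)
exact_sum = math.fsum([0.1] * 10)    # tracks partial sums to avoid drift
product = math.prod([1, 2, 3, 4])
```

How far the built-in sum() drifts on the same input depends on the Python version, so no comparison is asserted here.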
dist(p, q)        → Euclidean distance between p and q (iterables)
hypot(*coords)    → Euclidean norm of vector with specified coordinates
sqrt(x)           → square root
exp(x)            → exponent (e ** x)
log(x)            → natural log (base e)
log10(x)          → log base 10
e                 → Euler's number

degrees(x), radians(x)          → degree/radian conversions
sin(x), cos(x), tan(x)          → trig functions
asin(x), acos(x), atan(x)       → arc (inverse trig) functions
sinh(x), cosh(x), tanh(x)       → hyperbolic functions
asinh(x), acosh(x), atanh(x)    → arc (inverse) hyperbolic functions
pi                              → the constant π

For full list of functions in math module:
https://docs.python.org/3/library/math.html

For complex number math, see cmath module
https://docs.python.org/3/library/cmath.html

Coding

statistics Module

Measures of Central Location

→ statistics module
→ s is a non-empty sequence or iterable

mean(s)           → arithmetic average of an iterable
fmean(s)          → converts everything to float, then calculates mean (faster than mean)
median(s)         → median (may not be an element of the iterable)
median_low(s)
median_high(s)    → ensures median is member of the iterable
mode(s)           → applies to numeric or nominal data

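The difference between the median variants shows up with an even number of data points, as in this sketch. The data is invented for illustration.

```python
import statistics

s = [1, 2, 3, 4]

avg = statistics.mean(s)
favg = statistics.fmean(s)               # always returns a float
med = statistics.median(s)               # average of the two middle values
med_low = statistics.median_low(s)       # lower of the two middle values
med_high = statistics.median_high(s)     # higher of the two middle values
most_common = statistics.mode(['red', 'blue', 'red'])   # nominal data is fine
```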
Measures of Spread

pstdev(s)       → population standard deviation
pvariance(s)    → population variance
stdev(s)        → sample standard deviation
variance(s)     → sample variance

quantiles(s, *, n=4, method='exclusive')
    → n=4 for quartiles, n=100 for percentiles
    → method='exclusive' / 'inclusive'
        → indicates if s is a sample that does/does not include most extreme population values
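A sketch contrasting population and sample measures on the same data. The data set is made up; its mean is 5 and its squared deviations sum to 32.

```python
import statistics

s = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(s)     # divides by n:     32 / 8
pop_sd = statistics.pstdev(s)
sample_var = statistics.variance(s)   # divides by n - 1: 32 / 7

# Three cut points split the data into four groups (quartiles).
quartiles = statistics.quantiles([1, 2, 3, 4, 5, 6, 7], n=4)
```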


Normal Distribution

→ NormalDist data type (class)
    → used to create and manipulate normal distributions of a random variable

d = NormalDist(mu=0.0, sigma=1.0)

d.mean, d.median, d.mode, d.stdev, d.variance

d.pdf(x)            → probability density function
d.cdf(x)            → cumulative distribution function
d.inv_cdf(p)        → inverse CDF (aka quantile function)
d.quantiles(n=4)    → returns a list of n-1 cut points for the quantiles

Normal Distribution

d.overlap(other_normal_dist)    → calculates area overlap of two distributions
d.samples(n)                    → returns list of n random samples

→ supports arithmetic operations
    + or – with constants    → translate distribution
    * or / with constants    → scales distribution

d = NormalDist(0, 1)

d * 5 + 20 → NormalDist(20, 5)

Normal Distribution

→ can also combine two normal distributions (+)

d1 = NormalDist(1, 3)    variance → 9.0
d2 = NormalDist(2, 4)    variance → 16.0

d1 + d2 → NormalDist(3, 5)

mean = sum of two means
    1 + 2 → 3

variance = sum of two variances
    9 + 16 → 25 → std dev = 5

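The arithmetic on NormalDist objects can be verified directly. This sketch uses the slide's numbers and needs Python 3.8 or later.

```python
from statistics import NormalDist

d = NormalDist(mu=0.0, sigma=1.0)

half = d.cdf(0)                  # half the area lies below the mean
scaled = d * 5 + 20              # scale by 5, then shift by 20
combined = NormalDist(1, 3) + NormalDist(2, 4)   # means add, variances add
cut_points = d.quantiles(n=4)    # three cut points for quartiles
```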
More functionality…

→ statistics module has more functionality

https://docs.python.org/3/library/statistics.html#module-statistics

Coding

The decimal Module
24

→ we have seen that floats do not have exact representations

→ most of the time that's not an issue
    → often deal with transforming data
    → slight loss of precision and rounding errors matter
    → but level of float precision is sufficient

→ you may have cases where the loss of precision is unacceptable
    → you have to store decimal numbers exactly
    → addition, subtraction, multiplication have to be exact
    → division is going to suffer from rounding errors
        1 / 3 = 0.333… → cannot store infinite decimal numbers
                       → but what precision?


→ Decimal objects can store decimal numbers exactly

→ but at what cost?
    → literals have to use strings to represent numbers - unwieldy
    → cannot use most math functions (they convert to floats)
        → many specialized math functions are defined in Decimal
    → arithmetic operations are slower than floats
    → they use more memory than floats

https://docs.python.org/3/library/decimal.html

→ implements IBM General Decimal Arithmetic Specification standard
    http://speleotrove.com/decimal/decarith.html

Decimal Objects

The Decimal data type

→ decimal module
    → Decimal data type (class)

Decimal(3)
    take the integer 3 and convert it to a Decimal object

Decimal(0.1)
    take the float 0.1 and convert it to a Decimal object

do you see the problem here?
    0.1 is a float → it is already inexact, before we even pass it to Decimal

Decimal('0.1')
    take the string 0.1 and convert it to a Decimal object
    0.1 will be stored exactly as 0.1 in the Decimal type

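The three ways of constructing a Decimal can be compared side by side. The variable names are only for illustration.

```python
from decimal import Decimal

from_float = Decimal(0.1)       # inherits the float's binary approximation
from_string = Decimal('0.1')    # exactly one tenth
from_int = Decimal(3)           # integers are exact, no string needed

float_check = 0.1 + 0.1 + 0.1 == 0.3
decimal_check = from_string + from_string + from_string == Decimal('0.3')
```

Printing from_float shows the long tail of digits that the float 0.1 really stores.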
0.1 + 0.1 + 0.1 == 0.3 → False

Decimal('0.1') + Decimal('0.1') + Decimal('0.1') == Decimal('0.3') → True

ok to use integers (not as strings)
→ integers have exact representations

Decimal(1) / Decimal(3) → Decimal('0.3333333333333333333333333333')

what precision?
→ default is 28 significant digits
→ we can override this value

Significant Digits

→ number of digits needed to represent the decimal number
    → leading zeros are ignored    001.2345 → 5 significant digits
    → trailing zeros are not!      1.2000   → 5 significant digits

→ important to understand how this affects arithmetic operations

Decimal('0.15') * Decimal(2) → Decimal('0.30')    (not 0.3)

Decimal('0.100') * Decimal('0.200') → Decimal('0.020000')

Rounding

→ can use the round() function
    → it will use a special rounding method defined by Decimal objects

round(Decimal('1.2335'), 3) → Decimal('1.234')
round(Decimal('1.2345'), 3) → Decimal('1.234')

→ Banker's rounding (round to closest, ties to closest even) → default
→ we can specify other types of rounding

Arithmetic Contexts

→ when we perform arithmetic operations on Decimal numbers
    → precision can affect results
    → rounding methodology can affect results

Example: suppose we are using a precision of 5, and Banker's rounding

d1 = Decimal('1.2325')
d2 = Decimal('122')          1.2325 * 122 = 150.3650

d1 * d2 → Decimal('150.36')
    → only 5 significant numbers – so had to round to two decimals
    → used Banker's rounding → 150.36

Arithmetic Contexts

→ view your current context settings

decimal.getcontext()
    → prec = 28
    → rounding = ROUND_HALF_EVEN (Banker's rounding)

→ later we'll see how to modify the arithmetic context

IMPORTANT
→ precision of a defined Decimal number is independent of context precision

    Decimal('1.23456789') will be stored exactly, even if context precision is 5

→ calculations, however, will use the context precision

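The distinction between stored precision and calculation precision can be seen in a sketch like the following. It uses decimal.localcontext() to keep the change local to one block, and assumes the default ROUND_HALF_EVEN rounding is in effect.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 5                       # applies only inside this block

    stored = Decimal('1.23456789')     # construction ignores prec
    product = Decimal('1.2325') * Decimal('122')   # result rounded to 5 digits
    plus_applied = +stored             # unary plus is an operation, so it rounds
```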
Mathematical Functions

→ standard arithmetic operators and functions:
    +, -, *, /, //, %, **
    round, abs, min, max, sum

→ careful with math module
    you can use…
    → Decimals get converted to floats first

→ Decimal objects implement some math functions:

    d = Decimal('…')    → d.exp()
                        → d.sqrt()
                        → d.ln()
                        → d.log10()
                        + more…

https://docs.python.org/3/library/decimal.html#module-decimal
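The difference between the math module and the Decimal methods is easy to see with a square root. A minimal sketch:

```python
import math
from decimal import Decimal

d = Decimal('2')

as_float = math.sqrt(d)    # d is converted to a float first, result is a float
as_decimal = d.sqrt()      # stays a Decimal and uses the context precision
```

The Decimal result carries as many significant digits as the current context allows, well beyond what the float holds.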


Coding

Arithmetic Contexts

Arithmetic Context

→ arithmetic contexts used in decimal calculations to define many things
    → precision of intermediate calculations
    → rounding algorithm

decimal.getcontext()
→ returns the current context information
    prec        → precision (defaults to 28)
    rounding    → the rounding algorithm (default ROUND_HALF_EVEN)
    and more…

→ we can change those definitions
    → globally
    → temporarily just for a section of code (using a context manager)

Rounding Methods

https://docs.python.org/3/library/decimal.html#rounding-modes

→ default is ROUND_HALF_EVEN
    → rounds to nearest, with ties to nearest even integer

    0.135 → 0.14    0.145 → 0.14

→ but can define other rounding methods

ROUND_HALF_UP
    → rounds to nearest with ties away from zero

    0.135 → 0.14    0.145 → 0.15

Global Context Changes

→ can modify prec and rounding in the global context
→ context settings persist for the remainder of the program

ctx = decimal.getcontext()
ctx.prec = 5
ctx.rounding = decimal.ROUND_HALF_UP

IMPORTANT
there seems to be an open bug in Jupyter's IPython kernel
    setting the global context settings gets reset in next cell

→ temporary workaround until bug is fixed
    use this as your first cell in notebook:
    !jupyter notebook --version


Temporarily Changing Context Settings

sometimes we want to temporarily change the context
    perform some operations using that context
    revert the context to its previous state

→ could change the global context

ctx = getcontext()
current_prec = ctx.prec
ctx.prec = new_prec

# perform operations

ctx.prec = current_prec

→ cumbersome    → may even forget to switch back


Using a Context Manager

→ much easier (and safer) to use a context manager

with decimal.localcontext() as ctx:           # create context and enter context manager
    ctx.rounding = decimal.ROUND_HALF_UP      # modify the local context
    print(round(Decimal('1.12345'), 4))       # → 1.1235

after exiting context manager, global context is automatically restored

print(round(Decimal('1.12345'), 4))           # → 1.1234

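A runnable sketch of the same idea, assuming the global context still has its default ROUND_HALF_EVEN rounding:

```python
import decimal
from decimal import Decimal

d = Decimal('1.12345')

with decimal.localcontext() as ctx:
    ctx.rounding = decimal.ROUND_HALF_UP
    inside = round(d, 4)    # the tie rounds away from zero here

outside = round(d, 4)       # global rounding is back in effect
```

Because the context manager restores the previous context even if an exception is raised, there is no way to forget to switch back.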
Coding

Custom Classes
25

→ everything in Python is an object
    → has a type (aka class)
    → has state
    → has functionality

for example, [1, 2, 3, 4, 5] is an object
    → its type is list
        → we say it is an instance of a list
    → its state is the elements in the list
    → functionality such as .append

l1 = [1, 2, 3]
l2 = ['a', 'b', 'c']
    → two different objects
    → both instances of the list type
    → but different state

l2.append('d') → affects l2, not l1

Methods and Bindings

→ why does l2.append('d') not affect l1?
    → append is a function that works on a specific instance of the class
    → append is called a method of the list class

→ when we call the append method: l2.append('d')
    → the method is bound to the object l2
    → basically it will operate on l2

→ in general append will operate on whatever list object is specified before the dot

l1.append(10)
l2.append('c')

Custom Classes

→ we can define our own custom types (classes)

→ instances of those classes will have
    → a type (the custom type we created)
    → some state (we can store values specific to the instance)
    → functionality (methods that are functions bound to the instance)

Initializers

→ when we create a new instance of a class
    → often want to create some initial state
    → usually by passing arguments to the "creation" phase
    → this is called the initialization phase

→ creation process is started by calling the class (type)

a = tuple([1, 2, 3])
    → we are calling the tuple class (using ())
    → passing it an argument: [1, 2, 3]
    → call returns a new tuple instance, initialized with the elements 1, 2, 3

→ every object creation follows this basic principle

reader = csv.reader(f, dialect=custom_dialect)
    → create an instance of the csv.reader class by calling it
    → pass some arguments used for initialization (file and dialect)
    → call returns an initialized new instance of reader

d = Decimal('1.2345')
    → create an instance of the Decimal class by calling it
    → pass some arguments used for initialization (number string)
    → call returns an initialized new instance of Decimal

Classes as Blueprints

→ classes are often referred to as blueprints for creating objects
→ a single class can be used to create many instances of that class
    → each instance will have its own state
    → the functions defined in the class become methods bound to the instance

because these functions are bound to the instance
    → they can access the state of the instance

suppose we have a Person class defined
    → we wrote our class so that the initializer requires first and last names

john = Person('John', 'Cleese')
eric = Person('Eric', 'Idle')

    → we implemented a greet() method to say hello

john.greet() → 'John says hi!'
eric.greet() → 'Eric says hi!'

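One possible shape for such a Person class is sketched below. The attribute names first_name and last_name are assumptions, and the sketch relies on the __init__ method, which the course introduces a little later.

```python
class Person:
    """A simple Person class (illustrative sketch)."""

    def __init__(self, first_name, last_name):
        # State specific to each instance.
        self.first_name = first_name
        self.last_name = last_name

    def greet(self):
        # Bound method: self is the instance that appears before the dot.
        return f'{self.first_name} says hi!'


john = Person('John', 'Cleese')
eric = Person('Eric', 'Idle')
```

Both instances share the same greet code, yet each call reads the state of the instance it is bound to.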
Creating Custom Classes in Python

→ use the class keyword

class Person:
    '''A simple Person class'''

→ code above is as simple a class as can be

→ but Python "injects" a lot of functionality into that class for us
    → it is callable
        p = Person()
        → this created a new instance of Person
    → Person and p have some state Python defined for us
        Person.__doc__ → 'A simple Person class'
        Person.__name__ → 'Person'
        type(p) → Person
        and more…


Defining Classes

→ classes are like templates for creating objects
→ objects have state and functionality
→ we can define what the state and functionality is using a class
    → every instance of that class will have that functionality
    → but every instance has its own state

class → Circle
    → state: radius
    → functionality: area(), perimeter()

circle_1 → Circle(radius = 1)
circle_2 → Circle(radius = 2)

→ two different circles
→ each one has its own value for radius
→ but formula to calculate area and perimeter can be common
    → it just needs access to the instance value for radius

→ to define a class we use the class keyword

class Circle:
    definition of class is indented

→ one (optional) part of the definition of a class is a docstring
    → basically documentation of the class

class Circle:
    """This class can be used to represent a circle
    and calculate area and perimeter
    """

→ this is a valid Python class

class Circle:
    """docs for class"""

→ class does not do much
→ but it still has quite a bit of functionality built in for us by Python

Circle.__name__ → 'Circle'
Circle.__doc__ → 'docs for class'
Circle.__class__ → type
Circle.__class__ is type → True

→ Python also makes the class callable

→ Circle can be called to create new instances of that class

c1 = Circle()
c2 = Circle()
    → two different instances of Circle

c1 is c2 → False

→ the type of c1 and c2 is Circle

type(c1) is Circle → True

→ c1 is an instance of Circle

isinstance(c1, Circle) → True

→ we can set attributes directly on the instances

c1 = Circle()
c1.radius = 10

c2 = Circle()
c2.radius = 20

→ we can retrieve the attribute from each instance

print(c1.radius) → 10
print(c2.radius) → 20

→ we create instances of a class by calling the class
→ we can set/get attributes directly on the instances using dot notation
→ to create and initialize a Circle instance

c1 = Circle()
c1.radius = 10

→ these attributes exist in the instance namespace → normally a dictionary

c1.__dict__ → {'radius': 10}

    → sometimes the state is not in that dictionary
    → but not in this course

→ but initializing the object state this way is cumbersome
→ we'll see a better way soon!

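The points above can be tried out in a few lines. This is a minimal sketch that reuses the Circle class and the c1/c2 names from the slides; the docstring text is shortened for illustration.

```python
class Circle:
    """This class can be used to represent a circle."""


# Calling the class creates a new, independent instance each time.
c1 = Circle()
c2 = Circle()

# Attributes are set directly on each instance with dot notation.
c1.radius = 10
c2.radius = 20

print(Circle.__name__)          # name of the class
print(Circle.__doc__)           # the docstring
print(c1 is c2)                 # two different objects
print(isinstance(c1, Circle))   # c1 is an instance of Circle
print(c1.__dict__)              # instance namespace holding 'radius'
print(c2.__dict__)
```

Each instance keeps its own dictionary, so changing c1.radius never affects c2.radius.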
Coding

Initialization

→ we've seen how to define custom classes
→ we call the custom class to create new instances of that class
→ but can we provide initial values when the instance is created?
→ we've seen this before!

d = Decimal('10.1')

→ creates a new Decimal instance
→ initialized to 10.1
→ the initial value was passed in the same call used to create the instance

→ could mimic this initialization somewhat

class Circle:
    """Circle class"""

def create_circle(radius):
    c = Circle()          # create the Circle instance (instantiation)
    c.radius = radius     # set the instance radius (initialization)
    return c              # return the initialized instance

c1 = create_circle(10)

type(c1) → Circle
c1.__dict__ → {'radius': 10}

Recall Methods

l1 = list('abc')
l2 = list('def')
    → two different instances of a list

l1.append('d')
l2.append('g')
    → same append function
    → but operates on two different instances of a list

l1 → ['a', 'b', 'c', 'd']
l2 → ['d', 'e', 'f', 'g']

l1.append('d') → append is bound to l1
l2.append('g') → append is bound to l2

obj.func() → func is bound to obj, and is called a method

The __init__ Method

→ the __init__ function is a special function that is called by Python when we create a new instance of a class

class Circle:
    def __init__(self):
        print('__init__ called…')

Instance creation: Circle() does two things

→ creates a new instance of the class (let's give it some name, new_obj)
→ calls the __init__ function, passing new_obj as the first argument
→ in that sense, __init__ is a method bound to new_obj

→ __init__ is a function defined inside the class
→ but a function nonetheless
→ we can define additional parameters!

class Circle:
    def __init__(self, radius):
        self.radius = radius

→ recall what we did here

def create_circle(radius):
    c = Circle()
    c.radius = radius
    return c

→ same thing!

→ note that the name self is not a special name – it is just convention
→ could name it something else

→ specify this additional parameter when we create the instance

c = Circle(10)
c.__dict__ → {'radius': 10}

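A short sketch of the two steps described above (instantiation, then initialization), using the same Circle class. The `created` flag is only there for illustration and is not part of the course code.

```python
class Circle:
    def __init__(self, radius):
        # Python passes the newly created instance as the first argument (self).
        self.radius = radius


c = Circle(10)          # creates the instance, then calls __init__(c, 10)
print(c.__dict__)

# radius is now required: calling Circle() without it fails.
try:
    Circle()
    created = True
except TypeError:
    created = False
print(created)
```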


Coding

Instance Methods

→ create instances from classes by calling them
→ use __init__ method to initialize instances
→ add value attributes using dot notation
→ but how do we add functionality?

c = Circle(10)
c.area() → math.pi * r ** 2

→ area needs to be a function in the class
→ bound to the instance when called with dot notation

→ exactly the same as the __init__ function
→ define a function in the class
→ first argument will be the instance

class Person:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        return f'Hello, {self.name}'

p = Person('Alex')
p.say_hello() → Hello, Alex

→ just like __init__ we can pass additional parameters to methods

class Person:
    def __init__(self, name):
        self.name = name

    def eat(self, food):
        return f'{self.name} is eating {food.lower()}.'

p = Person('Alex')
p.eat('Broccoli') → Alex is eating broccoli.

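Applying the same idea to the Circle class from earlier, here is a minimal sketch of the area() and perimeter() methods mentioned at the start of this section. The method bodies are an assumption based on the standard formulas.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        # self gives the method access to this instance's radius
        return math.pi * self.radius ** 2

    def perimeter(self):
        return 2 * math.pi * self.radius


c1 = Circle(1)
c2 = Circle(2)

print(c1.area(), c2.area())     # same function, different instances
print(c1.perimeter())

# Calling through the instance is equivalent to passing it explicitly.
print(Circle.area(c2) == c2.area())
```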
Coding

Special Methods

→ already seen __init__
→ provides special behavior to our custom classes
→ there are many other such methods that provide special behavior
→ they start and end with double underscores
→ often referred to as dunder methods
  (so don't use this convention for your own method names!)

Object String Representations

l = [1, 2, 3]
print(l) → '[1, 2, 3]'    (a string)

class Circle:
    def __init__(self, r):
        self.radius = r

c = Circle(10)
print(c) → <__main__.Circle object at 0x7fc2703b4b20>

→ Python's default string representation of our custom objects

→ can override this default behavior
→ via special dunder methods
    → __str__
    → __repr__

str(c) → will call c.__str__()
repr(c) → will call c.__repr__()

→ why two methods?

__str__ is used for string representation for users
__repr__ is used for string representations for developers (more details usually)

→ print(c) uses __str__ if present
→ otherwise __repr__
→ otherwise default (class name & object id)

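A small sketch of both methods on the Circle class. The exact wording of the returned strings is an arbitrary choice for illustration.

```python
class Circle:
    def __init__(self, r):
        self.radius = r

    def __str__(self):
        # friendly text, aimed at users
        return f'Circle with radius {self.radius}'

    def __repr__(self):
        # more technical text, aimed at developers
        return f'Circle(r={self.radius})'


c = Circle(10)
print(str(c))     # uses __str__
print(repr(c))    # uses __repr__
print(c)          # print uses __str__ when it is defined
print([c])        # containers show their elements using __repr__
```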
Object Equality

l1 = [1, 2, 3]
l2 = [1, 2, 3]

not the same objects    l1 is l2 → False
but they are equal      l1 == l2 → True

class Person:
    def __init__(self, name):
        self.name = name

p1 = Person('Alex')
p2 = Person('Alex')

not the same objects    p1 is p2 → False
                        p1 == p2 → False

→ we can override equality definition for our custom objects
→ __eq__ method

class Person:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return self.name == other.name

p1 = Person('Alex')
p2 = Person('Alex')

p1 == p2 → p1.__eq__(p2) → True

in general a == b → a.__eq__(b)

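The __eq__ shown above fails with an AttributeError if other has no name attribute (for example p1 == 'Alex'). One common way around that, sketched here as an optional refinement rather than the course's own code, is to return NotImplemented for unrelated types.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Person):
            # tell Python this comparison is not supported by Person
            return NotImplemented
        return self.name == other.name


p1 = Person('Alex')
p2 = Person('Alex')
p3 = Person('Eric')

print(p1 is p2)       # different objects
print(p1 == p2)       # equal by name
print(p1 == p3)
print(p1 == 'Alex')   # unrelated type: compares as not equal, no exception
```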
Coding

Properties

→ we have seen how to define custom classes and how to
    → define instance methods
    → get/set attributes directly on the instance
        c.radius = 10
        self.radius = 10
      (sometimes called "bare" attributes)

class Person:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        return f'Hello, my name is {self.name}'

alex = Person('Alex')
alex.say_hello() → Hello, my name is Alex

alex.name = 'Eric'
alex.say_hello() → Hello, my name is Eric



→ we have been accessing these attribute values directly
→ we have no control over what the assigned values are
→ we have no control over formatting or modifying an attribute when it is read
→ sometimes we do have control!

we can control things in __init__ when the instance is created

class Sale:
    def __init__(self, quantity):
        if not isinstance(quantity, int):
            raise ValueError('Must be an int')
        self.quantity = quantity

→ cannot control how it is set subsequently

class Sale:
    def __init__(self, quantity):
        if not isinstance(quantity, int):
            raise ValueError('Must be an int')
        self.quantity = quantity

s = Sale(10)
s.quantity = "zero"    → this works!

Properties

a property is like an attribute, but
→ the value is set via a method (setter)
→ the value is retrieved via a method (getter)

if name is a property in the Person class, and p is an instance

p.name = 'Alex'
    → calls the setter method for name, passing 'Alex'

print(p.name)
    → calls the getter method for name, returning a value

Read-Only Properties

→ can create read-only properties
    → define a getter method
    → but don't define a setter

(write-only properties are possible, but not common, and a little harder to achieve)

Creating a Read-Only Property

→ define a method, with the name of the property
→ decorate the method with @property

class Math:
    @property
    def pi(self):        # this is a getter method
        return 3.14

m = Math()
m.pi → calls the getter method pi, bound to m (no parentheses needed)

class Person:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        return self._name

notice the underscore in _name
→ convention
→ signifies _name is a private attribute to the class
→ people using this class should not modify it directly

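To see the read-only behavior in action, here is a brief sketch using the same Person class. The `error` variable is just a helper for the demonstration.

```python
class Person:
    def __init__(self, name):
        self._name = name   # "private" by convention only

    @property
    def name(self):
        return self._name


p = Person('Alex')
print(p.name)       # calls the getter, no parentheses

error = None
try:
    p.name = 'Eric'  # no setter was defined
except AttributeError as ex:
    error = ex
print(type(error).__name__)
```

The underscore is only a convention: p._name = 'Eric' would still work, which is why users of the class are expected not to do that.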
Read/Write Property

→ first define a getter → then define the setter

class Person:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value

all these property names must be the same
(the getter method, the name in @name.setter, and the setter method)

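A setter is where validation can live. The sketch below revisits the earlier Sale example with quantity turned into a read/write property; routing the __init__ assignment through the setter is a design choice made here for illustration.

```python
class Sale:
    def __init__(self, quantity):
        # this assignment goes through the setter, so it is validated too
        self.quantity = quantity

    @property
    def quantity(self):
        return self._quantity

    @quantity.setter
    def quantity(self, value):
        if not isinstance(value, int):
            raise ValueError('Must be an int')
        self._quantity = value


s = Sale(10)
s.quantity = 20          # valid, accepted

rejected = False
try:
    s.quantity = "zero"  # now rejected by the setter
except ValueError:
    rejected = True

print(s.quantity, rejected)
```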
Calculated Properties

→ properties are very general
→ they are just methods
→ they do not have to be used just to return an attribute
→ they can just calculate and return some value

class Person:
    def __init__(self, dob):
        self.dob = dob

    @property
    def age(self):
        age = <calc current age>
        return age

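One possible way to fill in the <calc current age> placeholder, assuming dob is a datetime.date. The calculation below is a sketch and is not necessarily how the course implements it.

```python
from datetime import date


class Person:
    def __init__(self, dob):
        self.dob = dob   # assumed to be a datetime.date

    @property
    def age(self):
        today = date.today()
        # has this year's birthday already happened?
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


p = Person(date(2000, 1, 1))
print(p.age)   # recalculated every time it is read
```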
Coding

3rd Party Libraries

26

In this and the next sections we are going to cover some popular 3rd party libraries

→ there are thousands of 3rd party libraries
→ so this is just a tiny subset
→ those libraries can have a ton of functionality
→ we can only scratch the surface in a course such as this
→ but, you will have all the tools and knowledge you need to research further
    → read the docs
    → read blog posts and see what other libraries are popular for your needs

pytz        → dealing with time zones and DST
dateutil    → provides an "intelligent" datetime string parser
requests    → used to query web servers and web APIs (over http(s))
numpy       → highly efficient implementations for array processing and math computations
pandas      → used for data manipulation and analysis
matplotlib  → used for creating plots and charts

→ these are 3rd party libraries
→ they need to be installed

pip install

→ we already installed them at the very start of this course
→ but you can also pip install them individually
→ you need to know the package name
→ library docs will have that information

→ create a virtual environment
    python3 -m venv env_name            (Linux)
    py -m venv env_name                 (Windows)

→ activate virtual environment
    source env_name/bin/activate        (Linux)
    .\env_name\Scripts\activate         (Windows)

→ install the library into the virtual environment
    pip install pytz

The pytz Library

→ used for dealing with time zones
→ implements the Olson (or IANA) database
→ supports DST (daylight saving time)
→ uniform naming convention
    US/Eastern
    America/New_York
    Europe/Paris
    → Area / Location
→ goes back to 1970 (Unix epoch)

→ https://pythonhosted.org/pytz/
→ pip install pytz

import pytz

pytz.all_timezones → a list of all named time zones

→ internally uses Python's tzinfo
→ but with some extras used for DST
→ a pytz timezone can be used instead of a tzinfo object

Looking up a Time Zone

→ can retrieve a time zone from its name

pytz.timezone('US/Eastern')
pytz.timezone('UTC') → pytz.UTC

→ can use these time zones instead of Python's tzinfo

datetime(
    2020, 5, 15, 10, 0, 0,
    tzinfo=pytz.timezone('US/Eastern')
)

→ careful: the pytz docs warn that passing a zone via tzinfo= is only reliable for fixed-offset zones such as UTC
→ for zones with DST, use the time zone's localize method instead

Making a naïve datetime aware

→ use pytz time zone's localize method

tz_ny = pytz.timezone('America/New_York')
tz_ny.localize(naive_dt)

→ pytz will figure out if it needs to use DST or not!
→ this just attaches the time zone information to the naïve datetime
→ it does not "convert" the datetime to the new timezone
  i.e. it assumes the datetime was given in the timezone that is being attached

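A quick sketch of localize, with two arbitrary example dates chosen so that one falls in winter and one in summer. It assumes pytz is installed.

```python
from datetime import datetime

import pytz

tz_ny = pytz.timezone('America/New_York')

# Same wall-clock time, attached to New York in January and in July.
winter = tz_ny.localize(datetime(2020, 1, 15, 10, 0, 0))
summer = tz_ny.localize(datetime(2020, 7, 15, 10, 0, 0))

print(winter.isoformat())   # standard time offset
print(summer.isoformat())   # daylight saving offset

# The clock time is unchanged; only the zone information was attached.
print(winter.hour, summer.hour)
```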
Converting aware datetimes to other time zones

→ once we have an aware datetime we can convert it to another timezone
→ use the astimezone method of the datetime object
→ but because we are using pytz timezone objects, conversions work fine, including DST calculations
→ if we start with a naïve UTC time, we can directly transform it to a specific timezone

<py_tz_timezone>.fromutc(<naïve datetime>)

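Both conversions can be sketched as follows. The date and the two zones are arbitrary examples, and pytz is assumed to be installed.

```python
from datetime import datetime

import pytz

tz_ny = pytz.timezone('America/New_York')
tz_paris = pytz.timezone('Europe/Paris')

# Aware datetime in New York, converted to Paris time.
ny_dt = tz_ny.localize(datetime(2020, 7, 15, 10, 0, 0))
paris_dt = ny_dt.astimezone(tz_paris)
print(paris_dt.isoformat())

# Naive datetime known to be in UTC, transformed directly to New York time.
naive_utc = datetime(2020, 7, 15, 14, 0, 0)
ny_from_utc = tz_ny.fromutc(naive_utc)
print(ny_from_utc.isoformat())

# Aware datetimes compare by instant, not by wall-clock time.
print(ny_dt == paris_dt == ny_from_utc)
```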
Coding

The dateutil Library

→ https://dateutil.readthedocs.io/en/stable/
→ pip install python-dateutil

→ parser
    → ability to automatically parse dates and times from strings in various formats
    → this is what we'll look at in this course

but it has a lot more…
→ computing dates based on advanced recurrence formulas
    → generate sequence of dates weekly on Tuesday and Thursday for 5 weeks
    → generate sequence of dates every weekday for 3 months
    → very similar to what you might see when you set recurring calendar meetings

Basic Parsing Functionality

from dateutil import parser

parser.parse('2020-01-01T10:30:00')
parser.parse('2020-01-01 10:30:00 am')
parser.parse('12/31/2020')
parser.parse('31/12/2020')

Ambiguous Month/Day

4/3/2020        2020/4/3

→ is this Month/Day or Day/Month?
→ parser default assumes Month/Day
  i.e. month is specified first
→ can override this by using dayfirst keyword argument

parser.parse('2020/4/3') → April 3, 2020
parser.parse('2020/4/3', dayfirst=True) → March 4, 2020

raises a ParserError exception if date is invalid or unrecognizable

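The parsing rules above, sketched with the same date strings. It assumes python-dateutil is installed. The example catches ValueError, which ParserError subclasses in recent dateutil versions.

```python
from dateutil import parser

iso = parser.parse('2020-01-01T10:30:00')
us_style = parser.parse('12/31/2020')    # unambiguous: 31 cannot be a month
eu_style = parser.parse('31/12/2020')    # unambiguous for the same reason

month_first = parser.parse('4/3/2020')                 # default: month first
day_first = parser.parse('4/3/2020', dayfirst=True)    # day first

print(iso, us_style, eu_style)
print(month_first.date(), day_first.date())

invalid = False
try:
    parser.parse('13/32/2020')   # no valid month/day reading
except ValueError:
    invalid = True
print(invalid)
```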
Fuzzy Parsing

→ parser can even attempt parsing strings that contain extra information
    → March the 4th, 2020
→ default parsing will not work
→ use fuzzy_with_tokens=True argument when calling parse
    → returns a tuple (parsed datetime, ignored text elements)
    → raises a ParserError exception if date is invalid or unrecognizable
→ it's quite good, but cannot handle just anything
    → May the fourth, 2020 is not recognized

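A sketch of fuzzy parsing with the string from the slide, assuming python-dateutil is installed. The contents of the skipped-token tuple depend on the parser, so the example only prints them.

```python
from dateutil import parser

text = 'March the 4th, 2020'

# fuzzy_with_tokens=True returns (datetime, tuple of ignored text pieces)
dt, skipped = parser.parse(text, fuzzy_with_tokens=True)

print(dt)        # the parsed datetime
print(skipped)   # the parts of the string that were ignored
```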
Coding

JSON Data

→ JavaScript Object Notation
→ it is a simple way of representing objects using just strings
→ very easy to transmit strings
    → over a network, as a text file, etc
→ JSON is a lightweight standard that we can use to
    → encode an object into a string → serialization
    → decode a JSON string into an object → deserialization
→ most often used when transmitting data over the web (e.g. REST APIs)

→ JSON is very simple
→ easy for humans to read and write JSON
→ easy for computers to parse and generate
→ it is a pure text format
→ language independent
  (Python, C++, C#, JavaScript, Java, etc)

consists of:

object → unordered key:value pairs delimited by { }    (dictionary)
array  → ordered list elements separated by , and delimited by [ ]    (list)
values → numbers (integer or with decimal point)
       → strings, delimited by double quotes "…"
       → boolean true or false    (note the lowercase!)
       → null    (None)
       → object → so objects can contain other objects, arrays
       → array → arrays can contain other arrays, objects

→ basically JSON looks like a Python dictionary!
→ a JSON object has a single root object – everything else is nested within it

Example

'''
{
    "firstName": "Eric",
    "lastName": "Smith",
    "address": {
        "country": "USA",
        "state": "New York"
    },
    "age": 28,
    "favoriteNumbers": [42, 3.14],
    "likesSushi": false,
    "driversLicense": null
}
'''

→ it's a string
→ root is an object
→ key: value pairs → key must be a string
→ strings must be double-quote delimited
→ "address" → value is another object
→ "favoriteNumbers" → value is a list
→ no trailing comma after the last item of an object or array

Important: order of key:value pairs is irrelevant in JSON – don't count on it!

→ white spaces (spaces, tabs, newlines) between elements do not matter

'''
{
    "firstName": "Eric",
    "lastName": "Smith"
}
'''

'''{"firstName":"Eric","lastName":"Smith"}'''

→ but which is more human readable?
→ note the stylistic difference: camelCase vs snake_case
→ of course, they are just strings, so you can use whatever you want

Deserializing JSON (decoding)

→ Python standard library json module
→ json.loads(json_string)
→ parses a JSON string and returns a dict object (when the root is a JSON object)

Since JSON is a standard, Python's loads can handle any standard JSON object

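Decoding the earlier example with json.loads, as a short sketch. Note how JSON false and null come back as Python False and None.

```python
import json

json_string = '''
{
    "firstName": "Eric",
    "lastName": "Smith",
    "address": {"country": "USA", "state": "New York"},
    "age": 28,
    "favoriteNumbers": [42, 3.14],
    "likesSushi": false,
    "driversLicense": null
}
'''

data = json.loads(json_string)

print(type(data))                    # the root object becomes a dict
print(data['address']['state'])      # nested object becomes a nested dict
print(data['favoriteNumbers'])       # array becomes a list
print(data['likesSushi'], data['driversLicense'])
```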
Serializing JSON (encoding)

→ json.dumps(dict)
→ returns a JSON string
→ have to be more careful here
→ basic JSON data types are very simple: int, float, str, bool, None
→ Python has a far richer set of data types
    → datetime, Decimal, custom classes, etc
    → those are not serialized by default, and if we try, we'll get an exception
→ there is a way to specify custom encoders
    → beyond the scope of this course

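A sketch of json.dumps with basic types, followed by the exception raised for a datetime. The dictionary contents are made up for illustration.

```python
import json
from datetime import datetime

data = {
    'name': 'Eric',
    'age': 28,
    'scores': [42, 3.14],
    'active': False,
    'license': None,
}

encoded = json.dumps(data)
print(encoded)               # False/None are written as false/null

# Types outside the basic set are not serializable by default.
failed = False
try:
    json.dumps({'created': datetime(2020, 1, 1)})
except TypeError as ex:
    failed = True
    print(ex)
```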
Coding

REST APIs

What is an API?

API → Application Programming Interface

[Diagram: Your Software Application ↔ send/receive data ↔ interface of another Software Application (a black box); the interface is what we interact with]

→ enables your application to interact with another application
→ a Python class exposes an API → methods, properties

[Diagram: Python Code ↔ send/receive data ↔ interface of a Class Instance]

→ these days many applications are "in the cloud"
    → CRM    → Payroll    → trading platforms    → Automated AI
→ they expose an API available via the web using http(s)
→ web sites
    → request data using a URL
    → this is called a GET request (fetches data)

[Diagram: web Browser → GET https://site.com/page1 → server Software Application]

How a browser retrieves a web page

GET       https://    mysite.com    /path/to/page.html
method    protocol    domain        resource path
(verb)

URI (Uniform Resource Identifier)
URL (Uniform Resource Locator)

→ also supports query arguments
    → basically like named arguments in Python functions

GET https://mysite.com/currentTemp?city=Chicago&units=metric

→ web server at mysite.com waits to receive these requests
→ browser sends request to web server
→ server sends back data (often html, but does not have to be!)
→ browser displays returned data

Sending Data

→ can also send data to a web server
    → e.g. user registration data
→ different methods/verbs → e.g. POST
→ specific "path" on web server we need to send the data to (specific URL)
→ data is attached when request is sent by browser
→ web server receives this data and does something with it
→ and usually returns a response of some kind

In general…

→ web servers listen for incoming requests
→ request contains
    → method: GET, POST, …
    → URL → specifies exactly what we are trying to "access"
    → query arguments (maybe)
    → "attached" data (maybe)

the set of URLs, query arguments, methods and data a web server understands
→ is essentially an API
→ data is not necessarily HTML – can be JSON, XML, …

REST APIs

→ REST APIs are special types of APIs
→ REST has to do with how they are implemented and their behavior
→ as users of the API we don't actually care if it's REST or something else!
→ one of the fundamental characteristics of a REST API is that calls are independent of each other (stateless)
    → call to API does not rely on remembering how you interacted with it in the past
→ not quite the same with web sites
    → log in
    → now you can access pages on the site
    → web server remembers who you are → stateful



Authentication / Authorization

→ REST APIs are generally secured
→ you need to be authenticated
    → web server needs to know who you are
    → usually a secret token you pass in the request
        → in something called headers
        → just an extra "bucket" of key-value data that can be sent/received along with request
→ you also need to be authorized to perform the request
    → you may be authorized to read some data
    → but you may not be authorized to create/delete that data

Authentication → establishes who you are to the system you are interacting with
Authorization → governs what you can/cannot do in system

API Data Formats

→ most modern APIs use JSON for sending/receiving data
→ sometimes uses XML, or even proprietary formats

JSON → simple, easy to read

{
    "firstName": "Davey",
    "lastName": "Jones",
    "ship": "Flying Dutchman",
    "lastSeen": 2017,
    "nationality": null
}

XML → more verbose, but also more powerful

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <firstName>Davey</firstName>
    <lastName>Jones</lastName>
    <lastSeen>2017</lastSeen>
    <nationality/>
</root>



Resources

→ REST APIs allow us to interact with entities, called resources

→ bank account
    → create new account
    → get balance
    → deposit, withdraw
    → delete the account

→ customer
    → create new customer
    → list accounts for specific customer
    → get customer info
    → update customer info
    → delete customer

GET https://.../customer/12345/account/5523?query=balance
    → {"balance": 2123.45, "asOf": "2020-04-05T15:35:45+00:00"}

POST https://.../customer/12345/account/5523
    + {"action": "withdraw", "amount": 100.0}
    → {"balance": 2023.45, "asOf": "2020-04-05T16:00:00+00:00"}

API Methods

→ since humans design/write these APIs, things are not always consistent!

GET         → retrieves resource(s)
            → often used with query args
POST        → used to create a resource
            → issuing the same POST request twice can end up creating two resources
PUT, PATCH  → usually used for updating an existing resource
DELETE      → delete a resource

Status Codes

→ making an HTTP request (GET, POST, etc) always returns a status code
→ plus whatever else the API specifies

2xx → success
    200 → OK              request was successful
    201 → Created         resource created successfully
    202 → Accepted        request accepted, but not finished processing (async)

4xx → you did something wrong
    400 → Bad Request     server did not understand the request
    401 → Unauthorized    technically this means "not authenticated"
    403 → Forbidden       this means not authorized
    404 → Not Found       server cannot find specified resource

5xx → server had an issue → usually not your fault!

→ many more… https://en.wikipedia.org/wiki/List_of_HTTP_status_codes



Finnhub Stock API

→ https://finnhub.io
→ provides free and paid tiers
→ REST API
    → uses JSON
    → mostly GET requests
→ you'll need to sign up for an account to follow along (free tier)
→ this is the web
    → things change often!
    → by the time you view these videos their APIs could have changed
    → but I'll show you how to read the documentation in case that happens
    → generally things stay backward compatible – clients get really annoyed otherwise!

Coding
(well, almost…)

The requests Library

→ Python has a module in the standard library for making http requests
    → slightly low level interface (think time vs datetime)
→ 3rd party library Requests: HTTP for Humans
    → pretty much standard
    → even Python's own docs suggest using it!

https://requests.readthedocs.io/en/master/

pip install requests

Making Requests

→ all standard methods/verbs are implemented as functions
    requests.get(…)
    requests.post(…)
    requests.put(…)
    etc…

→ common arguments
    url     → the URL request will be sent to
    params  → dictionary of query parameters (key = value)
    json    → JSON sent in request (usually for POST, PUT, etc)
    headers → dictionary of headers (key = value)
    and many more…

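To see how url, params and headers fit together without contacting any server, a request can be built and prepared but not sent. This sketch assumes requests is installed. The URL is the fictitious mysite.com example from the earlier slide, and the Accept header is just an illustrative choice.

```python
import requests

# Build the request object; nothing is sent over the network here.
req = requests.Request(
    method='GET',
    url='https://mysite.com/currentTemp',
    params={'city': 'Chicago', 'units': 'metric'},
    headers={'Accept': 'application/json'},
)
prepared = req.prepare()

print(prepared.method)
print(prepared.url)        # query parameters are encoded into the URL
print(prepared.headers)
```

requests.get(url, params=..., headers=...) performs the same preparation and then actually sends the request, returning a Response object.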


Receiving Responses

→ result of making a request (get, post, etc) is a Response object
→ it has the following attributes (amongst others):

status_code → e.g. 200, 403
reason      → e.g. OK, Forbidden
text        → content of the response
json()      → a method that returns the deserialized JSON (if any) → so usually a dict
            → calling it when no JSON is present raises a ValueError
headers     → dictionary of headers received from server
cookies     → cookies received from server



Example: Google search results (HTML response)

search:
→ search terms: python http requests
→ number of results: 5

https://www.google.com/search?q=python+http+requests&num=5

→ using requests library to retrieve the HTML search results

response = requests.get(
    url='https://www.google.com/search',
    params={'q': 'python http requests', 'num': 5}
)

response.status_code → 200
response.reason → OK
response.text → HTML page browser would display

→ calling an undefined URL

response = requests.get(
    url='https://www.google.com/search2',
    params={'q': 'python http requests', 'num': 5}
)

response.status_code → 404
response.reason → Not Found

Coding

NumPy

27

→ NumPy is a widely used library mainly used for working with arrays
    → very fast
    → very memory efficient
    → very flexible

pip install numpy

https://numpy.org/

What are arrays?

→ basically lists
→ a Python list is a type of array
    → elements are indexed → arr[0], arr[1], …
    → array can be sliced → arr[start:stop:step]
    → variable size → can add / remove elements from array
    → heterogeneous → elements can have different data types
→ a NumPy array (ndarray)
    → fixed size
    → homogeneous

Python list vs NumPy ndarray

→ these are some of the similarities and differences

                  ndarray                        list
size              fixed size                     variable size
                  (can be reshaped)
element types     homogeneous                    heterogeneous
                  elements have specialized,     elements are Python objects
                  restricted data types
indexing          arr[i]                         lst[i]
slicing           arr[a:b:c]                     lst[a:b:c]
masking           arr[(arr > 2) & (arr < 10)]    not supported
fancy indexing    arr[[0, 3, 4]]                 not supported

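The rows of the comparison can be tried directly. This is a small sketch with arbitrary sample values, assuming NumPy is installed.

```python
import numpy as np

lst = [1, 3, 5, 7, 9, 11]
arr = np.array(lst)

# Indexing and slicing look the same for both.
print(arr[1], lst[1])
print(arr[1:5:2], lst[1:5:2])

# Masking: keep only the elements that satisfy a condition (ndarray only).
masked = arr[(arr > 2) & (arr < 10)]
print(masked)

# Fancy indexing: pick elements by a list of positions (ndarray only).
picked = arr[[0, 3, 4]]
print(picked)

# Fixed size, but the same elements can be viewed in a different shape.
reshaped = arr.reshape(2, 3)
print(reshaped)
```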
NumPy Efficiency

→ more space efficient than Python lists
→ array manipulation and calculations are much faster
    → vectorization
→ but at a cost
    → fixed size
        → once created, cannot add/remove elements
        → elements can be replaced
    → homogeneous → all elements must be the same type
        → even in multi dimensional arrays (arrays of arrays)
    → data types → it uses data types from underlying C language
        → memory efficiency & vectorization



Integer Sizes

→ integers are stored as sequences of bits (0s and 1s)
→ number of bits determines how large the integer can be

4 bits → largest number $(1111)_2 = 2^0 + 2^1 + 2^2 + 2^3 = 15$
       → range is: [0, 15]    (16 numbers)

→ but may want negative numbers
→ in that case, one bit is reserved to keep track of the sign
    → 3 bits → $(111)_2 = 2^0 + 2^1 + 2^2 = 7$

    -7  -6  …  -1  -0  +0  +1  +2  …  +6  +7

    → 0 does not need a sign → [-8, 7]

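The ranges follow directly from the number of bits. The two helper functions below are hypothetical names introduced only for this sketch, and the signed range assumes the usual representation in which the spare "negative zero" slot is used for one extra negative number.

```python
def unsigned_range(bits):
    """Smallest and largest value of an unsigned integer with the given bit count."""
    return 0, 2 ** bits - 1


def signed_range(bits):
    """Smallest and largest value of a signed integer with the given bit count."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(unsigned_range(4))   # the 4-bit example above
print(signed_range(4))     # one bit used for the sign

for bits in (8, 16, 32, 64):
    print(bits, signed_range(bits), unsigned_range(bits))
```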
Integer Sizes

signed integers
    → 8 bits     [-128, 127]
    → 16 bits    [-32_768, 32_767]
    → 32 bits    [-2_147_483_648, 2_147_483_647]
    → 64 bits    [-9_223_372_036_854_775_808, 9_223_372_036_854_775_807]

unsigned integers
    → 8 bits     [0, 255]
    → 16 bits    [0, 65_535]
    → 32 bits    [0, 4_294_967_295]
    → 64 bits    [0, 18_446_744_073_709_551_615]

Floats

→ Python uses 64 bits to store floats
    → certain precision and size of exponent
→ C also has 32-bit floats
    → less precision, smaller exponent
    → but more efficient storage

NumPy Types

→ in NumPy you choose your data type
→ if you pick an unsigned 8-bit integer, you can only store numbers in [0, 255]

signed integers   → int8, int16, int32, int64
unsigned integers → uint8, uint16, uint32, uint64
floats            → float32, float64    (float64 is compatible with Python float)
complex           → complex64, complex128    (complex128 is compatible with Python complex)

https://numpy.org/doc/stable/user/basics.types.html

Vectorization

suppose we want to multiply every element of one array by the corresponding element in another array

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
    → result = [10, 40, 90, 160]

→ loop

result = []
for i in range(4):
    result.append(a[i] * b[i])

or [x * y for x, y in zip(a, b)]

at every iteration of the loop, Python must:
→ look up the operand objects
→ determine the types
→ try to perform the operation (if a * b does not work, it tries b * a)

C does not have to do all that work → significantly faster

Vectorization

NumPy implements things in such a way that
→ given a and b are NumPy arrays (ndarray)
→ given a supported function or operator

a + b            → add(a, b)
a * b            → multiply(a, b)
a / b            → divide(a, b)
sin(a) / sin(b)  → divide(sin(a), sin(b))

→ NumPy pushes the loop and calculations down into C
→ this is called vectorization
→ these functions are called universal functions (ufunc)

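The same multiplication, written once as a Python loop and once vectorized. This is a sketch using the a and b values from the previous slide, assuming NumPy is installed.

```python
import numpy as np

a_list = [1, 2, 3, 4]
b_list = [10, 20, 30, 40]

# Plain Python: the loop runs in the interpreter, one element at a time.
loop_result = [x * y for x, y in zip(a_list, b_list)]

# NumPy: the operator maps to a universal function that loops in C.
a = np.array(a_list)
b = np.array(b_list)
vec_result = a * b                 # same as np.multiply(a, b)

print(loop_result)
print(vec_result)
print(np.array_equal(vec_result, np.multiply(a, b)))

# Universal functions compose the same way.
ratio = np.sin(a) / np.sin(b)      # same as np.divide(np.sin(a), np.sin(b))
print(ratio)
```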
Why are Arrays Important?

→ most data we deal with is represented as arrays
→ often multi-dimensional arrays

→ an image is a 2-dimensional array of colored pixels
    → each pixel is an array, e.g. [red, green, blue, alpha]
→ a video is an array of images (a bit oversimplified)
→ audio is encoded into arrays
→ stock quotes, tick data are arrays of data
→ an Excel spreadsheet is a (2-dimensional) array

NumPy is a huge library
→ lots of universal functions
→ financial, math, stats, linear algebra, sorting, sampling, Fourier transforms (discrete) and more…
→ we'll just look at a few of these
→ also introductory look at array creation and manipulation (indexing, slicing, fancy indexing, masking, reshaping)
https://numpy.org/doc/stable/
→ it also is the foundation for the Pandas library (dealing with data sets)
© MathByte Academy 791
Creating Arrays from Lists
© MathByte Academy 792
→ first thing is we have to import NumPy
    import numpy
→ typically everyone aliases it for less typing
    import numpy as np
→ the array type is np.ndarray (n-dimensional array)
© MathByte Academy 793
    a = np.array([1, 2, 3])
    type(a) → ndarray
→ but what type was used for the elements themselves?
→ remember that in NumPy we use the C types, not the Python types
→ also an array is homogeneous, i.e. every element has the same data type
    a.dtype → int64
→ NumPy analyzes the data and picks something appropriate
→ in this case a 64-bit integer (the default integer size can depend on the platform and NumPy version)
→ for floats it defaults to 64-bit floats
© MathByte Academy 794
Specifying the element data type
→ we can override that default and select a specific type
    a = np.array([1, 2, 3], dtype=np.int8)
    a.dtype → int8
Careful!
→ do not use a type that is too restrictive
→ an integer in the list that is too large for the specified dtype overflows (older NumPy versions wrap around silently, newer versions raise an error)
→ floats in a list will be truncated if dtype is set to an integer
→ why not just always use int64?
→ memory efficiency for extremely large datasets
© MathByte Academy 795
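A short sketch of the dtype points above: selecting a type, float truncation, and the memory trade-off. The array contents and the element count n are made up for illustration.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)
print(small.dtype)                   # int8

# Floats are truncated toward zero when an integer dtype is requested.
truncated = np.array([1.9, -2.7, 3.5], dtype=np.int64)
print(truncated.tolist())            # [1, -2, 3]

# Memory used by one million elements: 1 byte each vs 8 bytes each.
n = 1_000_000
bytes_int8 = np.zeros(n, dtype=np.int8).nbytes
bytes_int64 = np.zeros(n, dtype=np.int64).nbytes
print(bytes_int8, bytes_int64)       # 1000000 8000000
```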
Multi-Dimensional Python Lists
→ in this course we'll stick to 2-dimensional arrays
    l = [
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]
    ]
→ rows and columns are also called axes
    rows → axis 0
    columns → axis 1
→ only using 2 dimensions is not particularly restrictive

    time        open  high  low  close  prev_close
    1603249488  100   102   98   102    100
    1603249498  200   202   198  202    200
    1603249587  300   302   298  302    300

© MathByte Academy 796


Converting Multi-Dimensional Lists to Arrays
→ works exactly the same way as with 1-D arrays
→ but again, remember that all the elements in the array must be of the same type
    l = [
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]
    ]
    m = np.array(l)    dtype → int64
    m = np.array(l, dtype=np.uint8)
© MathByte Academy 797
Array Shape
→ shape of an array is number of elements in each dimension
    [
        [1, 2, 3],
        [4, 5, 6]
    ]
    → 2 dimensions
    → first dimension has 2 elements
    → second dimension has 3 elements
    → (2, 3)

    [1, 2, 3]
    → 1 dimension
    → first dimension has 3 elements
    → (3,)
→ use the shape attribute of ndarray objects
© MathByte Academy 798
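As a quick check of the shapes listed above, here is a small sketch using the same two arrays; the names m and v are illustrative.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])
v = np.array([1, 2, 3])

print(m.shape)         # (2, 3): 2 rows (axis 0), 3 columns (axis 1)
print(v.shape)         # (3,)
print(m.ndim, v.ndim)  # 2 1
```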


Coding
© MathByte Academy 799
Creating Arrays from Scratch
© MathByte Academy 800
→ seen how to create arrays from lists
→ handy to convert lists of data loaded from a CSV file for example
→ or retrieved via a web API
→ sometimes we just need to generate specialized arrays
→ could do it from a Python list
→ but NumPy has several convenient functions
© MathByte Academy 801
Array of zeros
    np.zeros(size_or_shape, dtype)
→ size_or_shape
    single number → 1-D array of that length, e.g. [0, 0, 0]
    tuple → shape (# rows, # columns), e.g.
        [
            [0, 0, 0],
            [0, 0, 0]
        ]
→ dtype
    optionally specify data type
    defaults to float64
© MathByte Academy 802
np.zeros → arrays filled with zeros
np.ones → arrays filled with ones
np.full → arrays filled with some specified constant value
np.eye → generates identity matrices
np.arange → generates 1-D array based on a range (start, stop, step)
np.linspace → generates evenly spaced numbers between start/stop
np.random.random → arrays filled with random floats in [0, 1)
np.random.randint → arrays filled with random integers in [low, high)
© MathByte Academy 803
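The functions listed above can be tried out as follows. This is only a sketch: the shapes, fill values and ranges are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))                     # float64 by default
ones = np.ones(3, dtype=np.int64)            # explicit integer dtype
sevens = np.full((2, 2), 7)                  # constant fill value
identity = np.eye(3)                         # 3 x 3 identity matrix
evens = np.arange(0, 10, 2)                  # stop value is excluded
points = np.linspace(0, 1, 5)                # stop value is included
rand_floats = np.random.random((2, 3))       # floats in [0, 1)
rand_ints = np.random.randint(1, 7, size=5)  # integers in [1, 7)

print(zeros.dtype)       # float64
print(evens.tolist())    # [0, 2, 4, 6, 8]
print(points.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]
```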
Coding
© MathByte Academy 804
Reshaping Arrays
© MathByte Academy 805
What is reshaping?
    [1, 2, 3, 4, 5, 6]    shape → (6,)
→ using the same elements, we can rearrange them
    shape (2, 3)
    [
        [1, 2, 3],
        [4, 5, 6]
    ]
    shape (3, 2)
    [
        [1, 2],
        [3, 4],
        [5, 6]
    ]
    shape (6, 1)
    [
        [1],
        [2],
        [3],
        [4],
        [5],
        [6]
    ]
© MathByte Academy 806
Reshaping Shares Elements
→ this is very important (and we'll see later this applies to slicing also)
    arr = np.array([1, 2, 3, 4, 5, 6])
    shap = arr.reshape(3, 2)
    arr → [1, 2, 3, 4, 5, 6]
    shap → [
        [1, 2],
        [3, 4],
        [5, 6]
    ]
→ these are the same elements
→ if you change the value at [0, 0] in shap, it will change the corresponding value in arr[0], and vice versa
→ in a sense, reshaping rearranges the "slots"

© MathByte Academy 807


Making a Copy
→ arr.copy()
→ this will make a copy of arr
→ can use to break the tie between an array and the reshaped array
© MathByte Academy 808
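A small sketch of the link between an array and its reshaped version, and of how copy() breaks it. The name independent is illustrative; arr and shap follow the earlier slide.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
shap = arr.reshape(3, 2)

# The reshaped array shares its elements with the original.
shap[0, 0] = 100
print(arr.tolist())              # [100, 2, 3, 4, 5, 6]

# A copy has its own elements, so changes no longer propagate.
independent = arr.reshape(3, 2).copy()
independent[0, 0] = -1
print(arr[0])                    # 100
```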
Coding
© MathByte Academy 809
Stacking
© MathByte Academy 810
→ concept is very straightforward
→ we can stack arrays one on top of each other (vstack)
→ or we can stack them side by side (hstack)

    1 2 3      10 20 30
    4 5 6      40 50 60
    7 8 9      70 80 90

    hstack →
    1 2 3 10 20 30
    4 5 6 40 50 60
    7 8 9 70 80 90

    vstack →
     1  2  3
     4  5  6
     7  8  9
    10 20 30
    40 50 60
    70 80 90

© MathByte Academy 811


Stacking
→ a1, a2, a3 are arrays
    np.vstack((a1, a2, a3))    → stack vertically
    np.hstack((a1, a2, a3))    → stack horizontally
→ the argument is a single sequence of arrays (a tuple here)
© MathByte Academy 812
Shapes must be Compatible
→ if stacking vertically, same number of columns for each array is required
→ if stacking horizontally, same number of rows for each array is required
    [diagram: vstack and hstack examples, each shown once with compatible shapes and once with incompatible shapes]
© MathByte Academy 813
What happens to dtype?
→ can stack arrays with different dtype
→ NumPy will determine a suitable common data type
→ vstack and hstack historically gave no way to control that
→ stacking uint8, uint16 and int64 → NumPy picks int64
→ stacking uint64 and int64 → NumPy picks float64 (no integer type covers both ranges)
→ since NumPy 1.20, the concatenate function (a more generic form of vstack and hstack) accepts a dtype argument to specify the data type
© MathByte Academy 814
Casting an Array to another Data Type
→ we can however control the stacked data type by first converting the arrays we are stacking to a common type
→ use the astype method on an array
    arr1.astype(np.int64)
→ so we could use this to stack multiple arrays
    np.vstack(
        [
            arr1.astype(np.int64),
            arr2.astype(np.int64)
        ]
    )
© MathByte Academy 815
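The stacking slides can be summarized with a short sketch. The arrays a1 and a2 and their contents are made up for illustration.

```python
import numpy as np

a1 = np.array([[1, 2, 3]], dtype=np.uint8)
a2 = np.array([[10, 20, 30]], dtype=np.int64)

v = np.vstack((a1, a2))      # same number of columns required
h = np.hstack((a1, a2))      # same number of rows required
print(v.shape, h.shape)      # (2, 3) (1, 6)
print(v.dtype)               # int64, the common type NumPy picked

# Casting first makes the resulting dtype explicit.
same = np.vstack([a1.astype(np.int64), a2.astype(np.int64)])

# The stack is independent of the arrays it was built from.
v[0, 0] = 99
print(a1[0, 0])              # 1
```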


Stacked Arrays are Independent of Original Arrays
→ we saw that a reshaped array is "linked" to the original array
→ this is not the case for stacked arrays
→ modifying an element in the stack does not modify original array
→ modifying element in original array does not modify the stack
© MathByte Academy 816
Coding
© MathByte Academy 817
Indexing
© MathByte Academy 818
Python Sequence Types
→ recall Python sequence types such as lists and tuples
→ elements are positionally indexed: 0, 1, 2, …
→ get element at index i
    lst[i]
→ replace element at index i (mutable sequences such as lists)
    lst[i] = x
→ indexing 2-D lists (a list of lists) works the same
    arr = [ [1, 2], [3, 4] ]
    arr[0][1] → 2
    arr[0][0] = 100    arr → [ [100, 2], [3, 4] ]
© MathByte Academy 819
Indexing NumPy Arrays
→ very similar to Python sequence types
    arr = np.arange(1, 7).reshape((2, 3))
    → [
        [1, 2, 3],
        [4, 5, 6]
      ]
    arr[0][0] → 1
    arr[1][2] → 6
→ with NumPy arrays instead of [i][j], we can use [(i, j)]
    arr[0][0] → arr[0, 0]
    arr[1][2] → arr[1, 2]
    (the index is a tuple, so we can omit the ())
→ for 1-D array
    arr = np.arange(1, 7)
    arr[1] → 2
    arr[(1,)] → 2

© MathByte Academy 820


Mutating Elements
→ works the same as Python lists
    arr = np.arange(1, 7)
    arr[2] = 30    arr → [1, 2, 30, 4, 5, 6]

    arr = np.arange(1, 7).reshape((2, 3))
    → [
        [1, 2, 3],
        [4, 5, 6]
      ]
    arr[1, 2] = 60
    → [
        [1, 2, 3],
        [4, 5, 60]
      ]
BEWARE: data types!

© MathByte Academy 821
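A brief sketch of indexing and mutation, including the data type warning above: assigning a float into an integer array drops the fractional part. The names grid and ints are illustrative.

```python
import numpy as np

grid = np.arange(1, 7).reshape((2, 3))
print(grid[1][2], grid[1, 2])    # 6 6

grid[1, 2] = 60
print(grid.tolist())             # [[1, 2, 3], [4, 5, 60]]

# The array keeps its integer dtype, so the float is truncated on assignment.
ints = np.arange(1, 7)
ints[2] = 30.9
print(ints.tolist())             # [1, 2, 30, 4, 5, 6]
```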


Coding
© MathByte Academy 822
Slicing
© MathByte Academy 823
Slicing Python Sequences
    l = [1, 2, 3, 4, 5]
    l[0:3] → [1, 2, 3]
→ slicing returns a new, independent, list
    slice_ = l[0:3]
    slice_[1] = 20
    slice_ → [1, 20, 3]
    l → [1, 2, 3, 4, 5]
© MathByte Academy 824
Slicing Python 2-D Sequences
    m = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]
    ]
→ want to slice in two axes
    m[0:2] → [
        [1, 2, 3],
        [4, 5, 6]
    ]
→ cannot just use a slice to isolate
    [
        [2, 3],
        [5, 6]
    ]
© MathByte Academy 825


Python Sequence Slice Assignments
→ we can mutate a Python list by using the assignment operator with a slice definition
    l = [1, 2, 3, 4, 5]
    l[0:3] = [10, 20, 30]
    l → [10, 20, 30, 4, 5]
→ since Python lists are not fixed size, we can also replace the slice with more or fewer elements
    l = [1, 2, 3, 4, 5]
    l[0:2] = [10, 20, 30, 40]
    l → [10, 20, 30, 40, 3, 4, 5]
© MathByte Academy 826
Slicing 1-D NumPy Arrays
→ very similar to slicing lists
    arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
    arr[0:3] → [0, 1, 2]    (an ndarray, not a list)
→ step, negative indexing, etc. are all supported, just like list slicing
    arr[2:6:2] → [2, 4]
    arr[1::2] → [1, 3, 5, 7]
    arr[::-1] → [8, 7, 6, 5, 4, 3, 2, 1, 0]
© MathByte Academy 827
Slicing 2-D Arrays
→ NumPy provides support for slicing along multiple axes
                  axis 1
                0  1  2
    axis 0  0   1  2  3
            1   4  5  6
            2   7  8  9
→ slice 0:2 along axis 0 and slice 1:3 along axis 1
    arr[0:2, 1:3]    (axis 0 slice, axis 1 slice)
→ can also write this as arr[:2, 1:]
© MathByte Academy 828
Slicing 2-D Arrays
→ can get even fancier when using steps
    (axis 0 runs down the rows, axis 1 across the columns)
     1  2  3  4  5
     6  7  8  9 10
    11 12 13 14 15
    16 17 18 19 20
    21 22 23 24 25
→ can think of this as the intersection of
    → rows 0, 2, 4 → [::2]
    → columns 1, 3 → [1::2]
    arr[::2, 1::2]
© MathByte Academy 829
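The two 2-D slicing examples above can be checked with a short sketch. The names small and grid are illustrative; the values match the grids on the slides.

```python
import numpy as np

small = np.arange(1, 10).reshape(3, 3)
# Rows 0-1 (axis 0) intersected with columns 1-2 (axis 1).
print(small[0:2, 1:3].tolist())    # [[2, 3], [5, 6]]

grid = np.arange(1, 26).reshape(5, 5)
# Rows 0, 2, 4 intersected with columns 1, 3.
print(grid[::2, 1::2].tolist())    # [[2, 4], [12, 14], [22, 24]]
```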
Slice Assignment in NumPy Arrays
→ works very similarly to assigning to list slices
→ cannot replace with an array that is not the same shape
→ also means we cannot change size of the original array
→ makes sense since NumPy arrays are fixed size
→ be careful with data types!
    a = np.array([1, 2, 3, 4, 5])
    a[0:3] = np.array([10, 20, 30])
    a → [10, 20, 30, 4, 5]
→ can also replace with a list or tuple: NumPy will handle it
© MathByte Academy 830
Slice Assignment in NumPy Arrays
→ can also assign a single value (not an array) to a slice
→ NumPy basically fills the slice with the same value repeated as many times as necessary (this is called broadcasting)
    arr = np.array([1, 2, 3, 4, 5, 6, 7])
    arr[::3] → [1, 4, 7]
    arr[::3] = 0
    arr → [0, 2, 3, 0, 5, 6, 0]
© MathByte Academy 831
Slices are "linked" to Original Array
→ similar to reshape we saw earlier
→ a slice is "linked" to the array it was sliced from
    arr = np.array([1, 2, 3, 4, 5])
    s = arr[0:3] → [1, 2, 3]
→ replacing an element in s will be "seen" by arr
→ and vice versa
→ to avoid this, make a copy of the slice
    s = arr[0:3].copy()
© MathByte Academy 832
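A sketch contrasting a slice (linked to the original) with a copied slice, plus the single-value slice assignment from the previous slide. The names c and arr2 are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

s = arr[0:3]           # a slice shares elements with arr
s[0] = 100
print(arr.tolist())    # [100, 2, 3, 4, 5]

c = arr[0:3].copy()    # an independent copy of the slice
c[1] = -1
print(arr.tolist())    # [100, 2, 3, 4, 5]

# Assigning a single value fills every selected position (broadcasting).
arr2 = np.array([1, 2, 3, 4, 5, 6, 7])
arr2[::3] = 0
print(arr2.tolist())   # [0, 2, 3, 0, 5, 6, 0]
```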
Coding
© MathByte Academy 833
Fancy Indexing
© MathByte Academy 834
→ we saw how to use single index values to specify an array item
    1-D → arr[3]
    2-D → arr[2, 5]
→ we saw how to use slicing
    1-D → arr[1:3:2]
    2-D → arr[1:3:2, :5]
→ single items at a time
→ items that can be defined using slicing
→ sometimes not enough: what if we want items (or rows) 1, 2 and 4?
© MathByte Academy 835
One way…
    arr = np.array([1, 2, 3, 4, 5, 6])
→ want an array consisting of elements at indices 0, 1, 3 and 5
    sub = np.array([arr[0], arr[1], arr[3], arr[5]])
→ works
→ but what we really have is an array of indices
    np.array([0, 1, 3, 5])
→ and NumPy supports specifying elements using an array of indices instead of just a single index
© MathByte Academy 836
Fancy Indexing
→ use an array of indices (an index array)
    arr = np.array([1, 2, 3, 4, 5, 6])
    index_array = np.array([0, 1, 3, 5])
    sub = arr[index_array]
    → 1, 2, 4, 6
→ can also just define the index array inline
    sub = arr[np.array([0, 1, 3, 5])]
© MathByte Academy 837
Array Index Shape
→ shape of array index determines shape of selection
    arr = np.array([1, 2, 3, 4, 5, 6])
    arr[np.array([0, 1, 3, 4])] → [1, 2, 4, 5]    shape (4,)
    arr[np.array(
        [
            [0, 1],
            [3, 4]
        ]
    )]
    → [
        [1, 2],
        [4, 5]
      ]    shape (2, 2)
© MathByte Academy 838
Fancy Indexing in Multiple Dimensions
→ fancy indexing can be applied to multiple axes
    [index_array, index]
    [index, index_array]
    [index_array, slice]
    [slice, index_array]
    [index_array, index_array]
© MathByte Academy 839
index_array and index
    arr →
         1  2  3  4  5
         6  7  8  9 10
        11 12 13 14 15
        16 17 18 19 20
        21 22 23 24 25
    arr[1, np.array([0, 1, 3])]    (single row)
    → [6, 7, 9]
    arr[np.array([0, 1, 3]), 1]    (single column)
    → [2, 7, 17]
→ note how resulting array is 1-D

© MathByte Academy 840


index_array and slice
    arr →
         1  2  3  4  5
         6  7  8  9 10
        11 12 13 14 15
        16 17 18 19 20
        21 22 23 24 25
    arr[1:3, np.array([0, 1, 3])]    (multiple rows, multiple columns)
    → [
        [6, 7, 9],
        [11, 12, 14]
      ]
    arr[:, np.array([0, 3])]
    → [
        [1, 4],
        [6, 9],
        [11, 14],
        [16, 19],
        [21, 24]
      ]

© MathByte Academy 841


index_array and index_array
→ keep index arrays same shape
→ not commonly used: can be confusing for someone reading your code
1-D and 1-D
    arr[np.array([0, 2]), np.array([1, 3])]
→ think of this as zipping the indices from the two axes
→ (0, 1) (2, 3)
    arr →
         1  2  3  4  5
         6  7  8  9 10
        11 12 13 14 15
        16 17 18 19 20
        21 22 23 24 25
    → [2, 14]
© MathByte Academy 842


index_array and index_array
2-D and 2-D
→ again think of this as zipping up indices from both axes
→ but now our "index array" is really 2-D as well
    arr[np.array([[0, 1], [3, 4]]),
        np.array([[0, 2], [1, 3]])]

    0 1      0 2      (0, 0) (1, 2)
    3 4      1 3  →   (3, 1) (4, 3)

    arr →
         1  2  3  4  5
         6  7  8  9 10
        11 12 13 14 15
        16 17 18 19 20
        21 22 23 24 25
    → [
        [1, 8],
        [17, 24]
      ]
© MathByte Academy 843
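The fancy indexing combinations above can be reproduced on the same 5 x 5 grid. This is a sketch; the names grid, rows and cols are illustrative.

```python
import numpy as np

grid = np.arange(1, 26).reshape(5, 5)
cols = np.array([0, 1, 3])

print(grid[1, cols].tolist())      # [6, 7, 9]      index and index array
print(grid[cols, 1].tolist())      # [2, 7, 17]     index array and index
print(grid[1:3, cols].tolist())    # [[6, 7, 9], [11, 12, 14]]

# Two index arrays are paired up element by element: (0, 1) and (2, 3).
rows = np.array([0, 2])
print(grid[rows, np.array([1, 3])].tolist())   # [2, 14]
```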
Coding
© MathByte Academy 844
Masking
© MathByte Academy 845
Boolean Masking
→ use an expression that evaluates to a boolean for each element of an array
→ make an array of those True/False values
→ use that array to "filter" elements in another array

    value    > 0      apply filter
     10      True     10
    -10      False
     20      True     20
    -20      False
     30      True     30
    -30      False
© MathByte Academy 846
Comparison Functions
→ functions which can be applied to each element of an array
→ returns an array containing the result for each element
    np.less(arr, value)
→ looks at every element of arr and evaluates element < value
    arr = np.array([1, 2, 3, 4, 5])
    np.less(arr, 4) → [True, True, True, False, False]
© MathByte Academy 847
NumPy Logic Functions
→ other functions exist:
    greater    less_equal    equal    not_equal    and more…
→ https://numpy.org/doc/stable/reference/routines.logic.html
→ but we can just use comparison operator symbols
    <    <=    >    >=    ==    !=
→ using these operators on arrays calls the corresponding NumPy functions
© MathByte Academy 848
Applying the Mask
→ this array of True/False values is called a mask
→ we can apply this mask to an array (use same shaped arrays)
    arr = np.array([1, 2, 3, 4])
    mask = np.array([True, True, False, True])
→ or just use
    mask = arr != 3
    arr[mask] → [1, 2, 4]
→ can do all this in a single statement
    arr[arr != 3]
© MathByte Academy 849
Masking 2-D Arrays
→ masks will return a 1-D array, even if array being masked is 2-D
→ basically applies mask element by element
    arr = np.array([
        [1, 2],
        [3, 4]
    ])
    mask = arr != 3
    mask → [
        [True, True],
        [False, True]
    ]
    arr[arr != 3] → [1, 2, 4]
→ result is 1-D
© MathByte Academy 850
Combining Logical Operators
→ Python uses and, or, not
→ for NumPy we have to use
    &    and
    |    or
    ~    not (complement)
→ because of operator precedence, use () to group logic expressions
    arr = np.arange(-10, 10)
    arr[(arr > 0) & (arr % 2 == 0)]
    → [2, 4, 6, 8]
© MathByte Academy 851
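A compact sketch of the masking steps above: building a mask, combining conditions and masking a 2-D array. The names mask and m are illustrative.

```python
import numpy as np

arr = np.arange(-10, 10)

# Each comparison produces a boolean array; & combines them element-wise.
mask = (arr > 0) & (arr % 2 == 0)
print(arr[mask].tolist())          # [2, 4, 6, 8]

# The function form gives the same result as the < operator.
print(np.less(np.array([1, 2, 3, 4, 5]), 4).tolist())   # [True, True, True, False, False]

# Masking a 2-D array returns a 1-D result.
m = np.array([[1, 2], [3, 4]])
print(m[m != 3].tolist())          # [1, 2, 4]
```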
Coding
© MathByte Academy 852
Universal Functions
© MathByte Academy 853
→ earlier we saw that universal functions are vectorized functions
→ they apply a function to each element of an array
→ the loop and function evaluation are done in C, not Python
→ very fast
→ we'll see how much faster in coding section
→ NumPy has a large number of universal functions
    → trig and hyperbolic
    → math operations (arithmetic, logs, exponentials, sqrt, abs, …)
    → comparison functions (equal, less than, greater than, min/max, …)
https://numpy.org/doc/stable/reference/ufuncs.html#available-ufuncs
© MathByte Academy 854
Universal Functions and Operators
→ add, subtract, multiply, divide, floor_divide, mod, power, …
→ can be called as functions, with at least one argument being an array
    np.add(arr_1, arr_2)
    np.add(arr_1, scalar)
→ or just use the + operator
→ Python will use np.add
→ similarly with -, *, /, //, %, **
© MathByte Academy 855
Array and Array
    [a0, a1, a2] + [b0, b1, b2] → [a0 + b0, a1 + b1, a2 + b2]
    [a0, a1, a2] % [b0, b1, b2] → [a0 % b0, a1 % b1, a2 % b2]

    a00 a01 a02        b00 b01 b02        a00 ** b00   a01 ** b01   a02 ** b02
    a10 a11 a12   **   b10 b11 b12   →    a10 ** b10   a11 ** b11   a12 ** b12

→ keep array shapes the same
→ technically possible to use different shapes → broadcasting
https://numpy.org/doc/stable/user/basics.broadcasting.html
© MathByte Academy 856
Array and Scalar
→ simplest form of broadcasting
    [a1, a2, a3] * 3 → [a1, a2, a3] * [3, 3, 3]
    (the scalar is broadcast to match the shape)

          a00 a01 a02        1 1 1       a00 a01 a02
    1  /  a10 a11 a12   →    1 1 1   /   a10 a11 a12
© MathByte Academy 857
Mismatched Shapes
→ sometimes possible
→ not going to focus on this in this course
    a1 = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]
    ]
    a2 = [10, 20, 30]    (same number of elements as each row of a1)

    a1 * a2 →
    [                  [
        [1, 2, 3],         [10, 20, 30],
        [4, 5, 6],    *    [10, 20, 30],
        [7, 8, 9]          [10, 20, 30]
    ]                  ]
© MathByte Academy 858
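A minimal sketch of universal functions used with arrays, scalars and a mismatched (but broadcastable) shape, using the a1 and a2 values from the slide above.

```python
import numpy as np

a1 = np.array([[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]])
a2 = np.array([10, 20, 30])

# Array and scalar: the scalar is broadcast to every element.
print(np.add(a2, 1).tolist())    # [11, 21, 31]
print((a2 ** 2).tolist())        # [100, 400, 900]

# Mismatched shapes: a2 is broadcast across each row of a1.
print((a1 * a2).tolist())        # [[10, 40, 90], [40, 100, 180], [70, 160, 270]]
```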


Coding
© MathByte Academy 859
Additional Math and Stats Functions
© MathByte Academy 860
→ NumPy has a host of array manipulation and computational functions
    → trig/hyperbolic, logs/exponents
    → linear algebra (matrix/vector products, eigenvectors/eigenvalues, inverses, etc.)
    → stats (averages, variances, correlations, histograms)
    → discrete Fourier transforms
https://numpy.org/doc/stable/reference/routines.html
→ simple financial functions
    → mainly related to interest calculations
    → removed from NumPy as of version 1.20 (they now live in the separate numpy-financial package)
    → don't use them
© MathByte Academy 861
Other More Specialized Libraries
→ many more specialized libraries
→ usually built on top of NumPy and Pandas
→ SciPy
    interpolation, optimization, integration, linear algebra, stats, …
→ statsmodels
    regression, imputation, models, time series, …
→ pyfolio
    portfolio performance and risk analysis
→ QuantLib
    quantitative financial library
→ Quandl
    useful for getting financial datasets directly into Python (not all datasets are free)
© MathByte Academy 862
Axes
→ recall discussion on axes
    l = [
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]
    ]
→ rows and columns are also called axes
    rows → axis 0
    columns → axis 1
→ many of the universal functions in NumPy can operate
    → on the array as a whole
    → along an axis
© MathByte Academy 863
Max
→ 1-D is intuitive
    np.amax(np.array([1, 2, 3])) → 3

    arr →        axis 1
    axis 0    1  2  3  4
              5  6  7  8
              9 10 11 12

    np.amax(arr) → 12
→ simply runs through all elements of array
© MathByte Academy 864
Max
→ can specify an axis
    np.amax(arr, axis=0) → performs the operation along axis 0, i.e. across the rows, giving one result for each column

    axis 0    1  2  3  4
              5  6  7  8
              9 10 11 12

    → [9, 10, 11, 12]
© MathByte Academy 865
Max
    np.amax(arr, axis=1) → performs the operation along axis 1, i.e. across the columns, giving one result for each row

         axis 1
    1  2  3  4    →  4
    5  6  7  8    →  8
    9 10 11 12    → 12

    → [4, 8, 12]
© MathByte Academy 866
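The axis behavior described on the Max slides can be checked with a short sketch on the same 3 x 4 array; the sum call is an extra illustration of another function that accepts an axis.

```python
import numpy as np

arr = np.arange(1, 13).reshape(3, 4)

print(np.amax(arr))                    # 12, over the whole array
print(np.amax(arr, axis=0).tolist())   # [9, 10, 11, 12], one value per column
print(np.amax(arr, axis=1).tolist())   # [4, 8, 12], one value per row

# Other aggregating functions accept the same axis argument.
print(arr.sum(axis=0).tolist())        # [15, 18, 21, 24]
```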
Other Functions
→ some functions only operate element by element
    sin    sinh    arcsin    arcsinh    log    exp    around    …
→ some functions, like amax, that operate on groups of data, support axes
    amax    amin    mean    median    std    sum    cumsum    prod    …
https://numpy.org/doc/stable/reference/routines.math.html
© MathByte Academy 867
Histogram
→ np.histogram → creates binned frequency distribution
    a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    bins →          [0, 3)    [3, 8)    [8, 10]
    frequencies →   3         5         3
→ define bin bounds using the left edge of each bin, plus the rightmost edge (which is inclusive)
→ bins = [0, 3, 8, 10]
    np.histogram(a, bins_arr) → tuple: (array of frequencies, array of bin edges)
    np.histogram(a, int) → calculates evenly spaced bins in min/max range
                         → tuple: (array of frequencies, array of bin edges)
→ other variants
https://numpy.org/doc/stable/reference/generated/numpy.histogram.html
© MathByte Academy 868
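A sketch of both np.histogram call styles from the slide above, using the same data. The names counts, edges, counts_even and edges_even are illustrative, and the choice of 5 bins is arbitrary.

```python
import numpy as np

a = np.arange(11)    # 0, 1, ..., 10

# Explicit bin edges: [0, 3), [3, 8), [8, 10] (the last bin includes its right edge).
counts, edges = np.histogram(a, bins=[0, 3, 8, 10])
print(counts.tolist())   # [3, 5, 3]
print(edges.tolist())    # [0, 3, 8, 10]

# An integer asks for that many evenly spaced bins between min and max.
counts_even, edges_even = np.histogram(a, bins=5)
print(len(edges_even))   # 6 edges for 5 bins
```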


Coding
© MathByte Academy 869
Pandas
28
© MathByte Academy 870
→ Pandas is built on top of NumPy
→ data manipulation and analysis, focused on tabular and time series data
→ arrays with rows and columns
→ but uses labels to identify rows and columns
→ in addition to positional indices
→ columns in the same array can have different data types
© MathByte Academy 871
Series → 1-dimensional
DataFrame → 2-dimensional
    → a collection of Series objects
Index → used to index Series and DataFrame objects
one of the key differences between Pandas and NumPy
→ NumPy array elements are indexed (implicitly) by position
→ in Pandas we can assign our own (explicit) labels
© MathByte Academy 872
→ this section will cover some of the basics of Pandas
→ Pandas is a huge library
→ lots of data querying and manipulation functionality
https://pandas.pydata.org/
→ user guide
    https://pandas.pydata.org/docs/user_guide/index.html
→ API reference
    https://pandas.pydata.org/docs/reference/index.html
© MathByte Academy 873
Indexes
© MathByte Academy 874
→ let me get something out of the way first :)
→ index
→ indexes? → indices?
→ both are correct
→ I am not always consistent!
usually…
→ I refer to elements of an index as indices
→ I refer to multiple index objects as indexes
© MathByte Academy 875
What is an index?
→ arrays / lists
    ['a', 'b', 'c', 'd']
      0    1    2    3
    element in array can be identified by its (positional) index
→ dictionaries
    {
        'a': 1,
        'b': 2,
        'c': 3
    }
    value in a dictionary can be identified via its key
→ an index is a way to "look up" one or more values in an array or dictionary
© MathByte Academy 876
Sequence Types
→ sequence types such as Python lists, tuples and NumPy arrays
→ have a natural positional order to their elements
→ this forms an implicit index on the sequence
    l = ['a', 'b', 'c', 'd']
          0    1    2    3
    l[0]      (uses the positional indices)
    l[1:4]
→ with Pandas we can define an explicit index (in addition to the implicit index)
    idx = ['first', 'second', 'third', 'fourth']
    l = ['a', 'b', 'c', 'd']
→ we'll see how this works later
© MathByte Academy 877
Pandas Indexes
→ pd.Index → most generic type of Index
→ they contain elements
→ they are based on NumPy arrays
→ they themselves have an implicit positional index
    idx = pd.Index([10, 20, 30, 40])    (argument can be a Python list, tuple, NumPy array, …)
    idx[0] → 10
    idx[1:4] → Index([20, 30, 40])    (returns an Index object)
    idx[[0, 2]] → Index([10, 30])
    idx[idx % 4 == 0] → Index([20, 40])
© MathByte Academy 878
Specialized Indexes
→ Int64 indexes
    for indexes that contain integer indices
→ Float64 indexes
    for indexes that contain float indices
→ Range indexes
    for integer sequence defined via a range
→ similar to difference between Python list and range
    [0, 1, 2, 3, 4, 5] → sequence is materialized
    range(0, 6) → sequence is not materialized
                → elements are produced as requested when iterating
→ Range indexes can be more efficient (storage and computation)
© MathByte Academy 879
Indexes Have Set-Like Properties
→ can find the union and intersection of indexes
    & → intersection
    | → union
    in → element of
→ in recent pandas versions (2.0 and later) & and | act element-wise on indexes; use the .intersection() and .union() methods for set operations
→ Pandas will use broadest data type needed for union/intersection
→ RangeIndex indexes will try to return a RangeIndex as result of union/intersection
→ not always possible
© MathByte Academy 880
String, Integer and Float Indexes
→ strings will result in an Index object, with an object data type (a catchall type)
    pd.Index(['a', 'b', 'c'])
→ integers will result in an Int64Index object
    pd.Index([1, 2, 3])
→ floats will result in a Float64Index object
    pd.Index([0.1, 0.2, 0.3])
→ note: pandas 2.0 and later no longer have separate Int64Index and Float64Index classes; these calls return a plain Index with an int64 or float64 data type
© MathByte Academy 881
Range Indexes
→ can create using the Python range object
    pd.Index(range(1, 10, 2))
→ can use Pandas RangeIndex class directly
    pd.RangeIndex(start, stop, step)
© MathByte Academy 882
→ index values do not have to be unique
    pd.Index([1, 1, 2, 2])
→ perfectly legal
but if we associate an index with a sequence, how does a non-unique index work?
    A    10
    B    20
    A    30
    B    40
item at index A? → two items! → [10, 30]
→ an index value may refer to multiple values in the associated array
→ a bit different from Python dictionaries
© MathByte Academy 883
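A sketch of the Index behavior covered in this section. It uses the union() and intersection() methods for set operations, and the Series with the repeated labels is only there to show a non-unique index; the names a, b, r and s are illustrative.

```python
import pandas as pd

idx = pd.Index([10, 20, 30, 40])
print(idx[0])                          # 10
print(idx[1:3].tolist())               # [20, 30]
print(idx[[0, 2]].tolist())            # [10, 30]
print(idx[idx % 4 == 0].tolist())      # [20, 40]
print(20 in idx)                       # True

# Set-like operations via methods.
a = pd.Index([1, 2, 3])
b = pd.Index([2, 3, 4])
print(a.union(b).tolist())             # [1, 2, 3, 4]
print(a.intersection(b).tolist())      # [2, 3]

# A range-based index does not store every element.
r = pd.RangeIndex(1, 10, 2)
print(list(r))                         # [1, 3, 5, 7, 9]

# A non-unique index label can refer to several values.
s = pd.Series([10, 20, 30, 40], index=['A', 'B', 'A', 'B'])
print(s.loc['A'].tolist())             # [10, 30]
```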
Coding
© MathByte Academy 884
Series
© MathByte Academy 885
Python Sequences, NumPy Arrays
→ associative arrays
    l = [10, 20, 30, 40, 50]
          0   1   2   3   4
    a = np.array([10, 20, 30, 40, 50])
                   0   1   2   3   4
→ those sequences (0, 1, 2, …) can be used to access (or reference) items in the sequence
→ they are also called indexes
→ based on position → positional index
→ there is an association between the index and the values → associative array
→ in Python lists, tuples, NumPy arrays, this positional index is implicit
→ index provides a unique mapping between indices and values
© MathByte Academy 886
Python Dictionaries
→ another type of associative array
→ mapping between keys and values
→ keys are not positional based
→ do not even have to be numbers
→ but it's still an associative array
    d = {'a': 1, 'b': 2, 'c': 3}
    index (keys)
    a → 1
    b → 2
    c → 3
→ unique index → no implicit positional index
© MathByte Academy 887
Pandas Series
→ another type of associative array
→ has some dictionary-like properties
→ has some sequence-like properties
→ it's a sequence type, so elements have a definite position in collection → positional index
→ can also define an explicit index → a second index

    implicit positional index    values    explicit custom index
    0                            10        a
    1                            20        b
    2                            30        c
    3                            40        d

→ the implicit positional index is always there
→ indices of the explicit index are also referred to as labels
© MathByte Academy 888
    implicit index    values    explicit index
    0                 10        a
    1                 20        b
    2                 30        c
    3                 40        d
→ can reference items by positional indices
    [0]    [1]    …
→ or by using the explicit index
    ['a']    ['b']    …
→ can even use slicing and fancy indexing
→ even with an explicit index that is not numerical
    ['a':'c']
    ['a':'d':2]
© MathByte Academy 889
→ indexing works as expected
→ slicing has a twist
    → positional index    [0:5]        → excludes endpoint
    → explicit index      ['a':'c']    → includes endpoint

    positional index     0    1    2    3
    values              10   20   30   40
    explicit index       a    b    c    d

    [0:2] → 10, 20
    ['a':'c'] → 10, 20, 30
Pandas understands that these are not positional indices (strings)
© MathByte Academy 890
A point of confusion…
    implicit index      0    1    2    3
    values           [100, 200, 300, 400]
    explicit index      2    3    4    5

    [2]      → is this using implicit index?
    [2:3]    → or explicit index?
if both implicit and explicit index are integers:
    [2] → uses explicit index
    [2:3] → uses implicit index
→ can be confusing
© MathByte Academy 891
loc and iloc attributes
→ allows us to specifically indicate use of implicit or explicit index
    implicit index      0    1    2    3
    s =               100, 200, 300, 400
    explicit index      2    3    4    5

    s.iloc[2] → uses implicit index
    s.loc[2] → uses explicit index
→ note the square brackets

© MathByte Academy 892
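A small sketch of loc versus iloc using the Series from the slides above; the second Series, t, is the string-labelled example from the slicing slide.

```python
import pandas as pd

# Both the implicit and the explicit index are integers here.
s = pd.Series([100, 200, 300, 400], index=[2, 3, 4, 5])
print(s.iloc[2])    # 300, position 2 in the implicit index
print(s.loc[2])     # 100, label 2 in the explicit index

# Slicing: positions exclude the endpoint, labels include it.
t = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(t.iloc[0:2].tolist())      # [10, 20]
print(t.loc['a':'c'].tolist())   # [10, 20, 30]
```

Using loc and iloc explicitly avoids having to remember which index a bare [] lookup will use.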


Deleting Items
→ indexes are immutable
→ deleting an item would require deleting the corresponding index value
→ instead use .drop() method
→ returns a new series with new explicit index
© MathByte Academy 893
Creating Series objects
    from pandas import Series
→ from a dictionary
    Series({'a': 1, 'b': 2})

    implicit index    0    1
    values            1    2
    explicit index    a    b

→ from a list, specifying explicit index using another list
    Series([1, 2], index=['a', 'b'])
© MathByte Academy 894
Series Attributes and Methods

.index   → returns the explicit Index object
.values  → returns a NumPy array of the values
.items   → zip of explicit index values and array values
.iloc    → used for indexing using implicit index
.loc     → used for indexing using explicit index
.drop    → used to remove an element by explicit index

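A short sketch of the creation options and attributes listed above, assuming pandas is installed. The sample labels and values are made up for illustration.

```python
import pandas as pd

# From a list, with the explicit index given as a second list
s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

labels = list(s.index)      # explicit index labels
values = s.values           # NumPy array of the values
pairs = list(s.items())     # (label, value) pairs

# drop() returns a new Series; the original is left unchanged
s2 = s.drop('b')

# From a dictionary: keys become the explicit index
from_dict = pd.Series({'a': 1, 'b': 2})

print(labels, values, pairs)
print(list(s2.index), len(s))
```
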
Coding

DataFrames

→ Series → analogous to a 1-D NumPy array with an explicit index
→ DataFrame → analogous to a 2-D NumPy array with an explicit index
    → for the rows
    → and for the columns

another way to look at it…

a DataFrame is a collection of Series objects
    → a common explicit index for the rows → series are aligned
    → the columns (Series) form a Series too → explicit index
        → column names possibly

                        explicit index for columns
                   county      population    gdp        area      implicit index for rows
    The Bronx      Bronx       1,418,207      42.695     42.10    0
    Brooklyn       Kings       2,559,903      91.559     70.82    1
    Manhattan      New York    1,628,706     600.244     22.83    2
    Queens         Queens      2,253,858      93.310    108.53    3
    Staten Island  Richmond      476,143      14.514     58.37    4
                   0           1              2          3        implicit index for columns

    (the first column holds the explicit index for rows)



→ can think of it as a Series of Series
→ or a dictionary of dictionaries

    {
        'county': {
            'The Bronx': 'Bronx',
            'Brooklyn': 'Kings',
            …
        },
        'population': {
            'The Bronx': 1_418_207,
            'Brooklyn': 2_559_903,
            …
        },
        'gdp': { … },
    }

    outer keys ('county', 'population', …) → column index labels
    inner keys ('The Bronx', 'Brooklyn', …) → row index labels

Constructing a DataFrame

pd.DataFrame(…)
    → from a list of Series objects
    → from a list of lists
    → from a list of dictionaries
    → from a dictionary of Series objects
    → from a dictionary of dictionaries

→ in some cases row and column explicit indexes are created as expected
→ in some cases we may have to define these indexes manually

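Two of these construction routes can be sketched side by side, assuming pandas is installed. With a dictionary of dictionaries the indexes come from the keys; with a list of lists they have to be supplied manually. The variable names are illustrative.

```python
import pandas as pd

# Dictionary of dictionaries: outer keys become column labels,
# inner keys become row labels
df_from_dicts = pd.DataFrame({
    'county': {'The Bronx': 'Bronx', 'Brooklyn': 'Kings'},
    'population': {'The Bronx': 1_418_207, 'Brooklyn': 2_559_903},
})

# List of lists: no labels in the data, so we define both indexes ourselves
df_from_lists = pd.DataFrame(
    [['Bronx', 1_418_207], ['Kings', 2_559_903]],
    index=['The Bronx', 'Brooklyn'],
    columns=['county', 'population'],
)

print(df_from_dicts)
print(df_from_lists)
```
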
Some DataFrame Properties and Methods

.info()       → prints some useful info about the data frame
.transpose()  → transposes the data frame, maintaining indexes
.rename()     → allows us to rename the index labels (rows and/or columns)
.set_index()  → use an existing column in the data frame as a row index
.index        → the Index object used to index the rows
.columns      → the Index object used to index the columns
.drop()       → used to drop rows/columns from the data frame

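The methods above can be tried on a small frame. This is a sketch assuming pandas is installed; the 'borough' column name is an assumption introduced here for the row labels.

```python
import pandas as pd

df = pd.DataFrame({
    'borough': ['The Bronx', 'Brooklyn'],
    'county': ['Bronx', 'Kings'],
    'population': [1_418_207, 2_559_903],
})

indexed = df.set_index('borough')                          # column becomes the row index
renamed = indexed.rename(columns={'population': 'pop'})    # rename a column label
transposed = indexed.transpose()                           # row and column indexes swap
without_county = indexed.drop(columns=['county'])          # drop a column
without_bronx = indexed.drop(index=['The Bronx'])          # drop a row

indexed.info()                                             # prints a summary
print(list(indexed.index), list(indexed.columns))
```

Each of these calls returns a new data frame; `df` and `indexed` are not modified.
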
Coding

Selecting Data

DataFrames

    axis 0 - rows
    axis 1 - columns

→ analogous to a Series of Series
→ or a dictionary of lists / dictionaries

    {
        "col0": {
            "row0": value,
            "row1": value
        },
        "col1": {
            "row0": value,
            "row1": value
        }
    }

→ a sequence of aligned columns



→ consider it as a Series of Series

               c1   c2   c3
          r1    1    2    3
    df =  r2    4    5    6
          r3    7    8    9

    → c1 is a series of values
    → c2 is a series of values
    → c3 is a series of values
    → they share a common row index ['r1', 'r2', 'r3']

→ df is like a Series [c1, c2, c3] with index ['c1', 'c2', 'c3']
→ or like a dictionary {'c1': c1, 'c2': c2, 'c3': c3}

    df['c1']  → this selects the item with label 'c1'
              → the column (series) c1

→ note that [] cannot be used with a positional index to select a column of a DataFrame

loc and iloc

→ just like with Series, but with 2 axes
    → loc uses the explicit index
    → iloc uses the implicit (positional) index
→ but think of DataFrame like a NumPy array with two axes

    df.loc[v1, v2]        df.iloc[v1, v2]

        v1 → axis 0 (rows)
        v2 → axis 1 (columns)

→ slicing and fancy indexing work the same way as with Series, but using 2 axes

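A minimal sketch of two-axis selection using the 3x3 frame from the earlier slide, assuming pandas is installed. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=['r1', 'r2', 'r3'],
    columns=['c1', 'c2', 'c3'],
)

col = df['c1']                                   # the column (Series) c1
cell_by_label = df.loc['r2', 'c3']               # row label, column label
cell_by_position = df.iloc[1, 2]                 # row position, column position

# Slice on axis 0, fancy indexing on axis 1
label_block = df.loc['r1':'r2', ['c1', 'c3']]    # label slice includes 'r2'
position_block = df.iloc[0:2, [0, 2]]            # positional slice excludes 2

print(cell_by_label, cell_by_position)
print(label_block)
```
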
Replacing Values

→ can replace values using the assignment (=) operator
→ replace single selected cell
    → with a scalar value
→ replace multiple cells selected using slicing/fancy indexing
    → with a 2-D NumPy array/list of lists of same shape
    → with a scalar value that will be broadcast
    → with a 1-D NumPy array that will be broadcast
→ can replace with a Series or DataFrame but indexes can cause issues!

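The replacement cases above, sketched in order on one small frame. This assumes pandas and NumPy are installed; the replacement numbers are arbitrary.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=['r1', 'r2', 'r3'],
    columns=['c1', 'c2', 'c3'],
)

# Single cell, scalar value
df.loc['r1', 'c1'] = 100

# Block of cells, 2-D array of the same shape
df.loc['r2':'r3', ['c2', 'c3']] = np.array([[50, 60], [80, 90]])

# Whole column, scalar broadcast to every selected cell
df.loc[:, 'c1'] = 0

# Block of cells, 1-D array broadcast across each selected row
df.iloc[0:2, 1:3] = np.array([-1, -2])

print(df)
```
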
Coding

Missing Values

Python

→ None object
    → can be used to indicate undefined or missing in a sequence
        [1, 2, None, 4]
→ IEEE standard for floats also has the concept of an undefined float
    → NaN (not a number)
        → float('nan')
        → math.nan
        → np.nan

Equality of NaN

→ two NaN values always compare False
    → cannot compare two undefined (unknown) values…

    a = math.nan
    b = math.nan

    a == b  → False
    a is b  → True (both names refer to the same math.nan object, so identity is not a usable NaN test either)

→ so how do we test if a number is NaN?
    → math.isnan()
        math.isnan(np.nan)  → True
    → NumPy universal function np.isnan()

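A small sketch of these comparisons, assuming NumPy is installed. Two separate float('nan') objects are used here so that neither equality nor identity can succeed.

```python
import math
import numpy as np

a = float('nan')
b = float('nan')

equal = (a == b)            # NaN never compares equal to another NaN
self_equal = (a == a)       # not even to itself

is_nan = math.isnan(a)                  # scalar test from the standard library
np_scalar = math.isnan(np.nan)          # np.nan is an ordinary float NaN
mask = np.isnan(np.array([1.0, np.nan, 3.0]))   # element-wise test

print(equal, self_equal, is_nan, np_scalar, mask)
```
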
Pandas Series

→ if the series is a series of floats
    nan   → nan
    None  → nan

    pd.Series([1, 2, None, np.nan])
        → [1.0, 2.0, NaN, NaN], dtype=float64
        (series was made into a float)

→ if the series is a series of object (for example for series of strings)

    pd.Series(['a', 'b', None, np.nan])
        → ['a', 'b', None, NaN], dtype=object
        (None was not converted to NaN)



Testing for Missing Data

→ could be None
→ could be NaN

→ pd.isnull()  → handles both
    → universal function (operates on Series or DataFrames)
    → returns element by element comparison
    → True if value is None or NaN

→ pd.notnull()
    → similar to isnull(), but opposite result

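A sketch of both tests on a float series and an object series, assuming pandas and NumPy are installed. The object dtype is requested explicitly here so the example does not depend on how a given pandas version infers string columns.

```python
import numpy as np
import pandas as pd

floats = pd.Series([1, 2, None, np.nan])                      # None becomes NaN
objects = pd.Series(['a', 'b', None, np.nan], dtype=object)   # None is kept

float_mask = pd.isnull(floats)      # True where the value is missing
object_mask = pd.isnull(objects)    # catches both None and NaN
present = pd.notnull(objects)       # the opposite result

print(float_mask.tolist())
print(object_mask.tolist())
print(present.tolist())
```
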
Replacing Series Missing Data

→ use loops to iterate and replace missing values
→ specialized Pandas functions
    → s.fillna(value)
        → replaces any null with specified value
    → s.fillna(method=…)
        → method = 'ffill'
            → forward fill
                null, 1, null, 2, null, null
                → [null, 1, 1, 2, 2, 2]
        → method = 'bfill'
            → backward fill

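The fills above can be sketched on the same sequence. This assumes pandas and NumPy are installed. Note that recent pandas releases deprecate the method= argument of fillna(); the sketch uses the equivalent .ffill() and .bfill() methods instead.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1, np.nan, 2, np.nan, np.nan])

constant = s.fillna(0)    # every null replaced by the given value
forward = s.ffill()       # each null takes the last valid value before it
backward = s.bfill()      # each null takes the next valid value after it

print(constant.tolist())
print(forward.tolist())   # the leading null has nothing before it and stays null
print(backward.tolist())  # the trailing nulls have nothing after them and stay null
```
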


Replacing DataFrame Missing Data

→ works same as Series replacement
→ but the axis is important for back/forward fills
    → back/forward fill along axis 1
    → or along axis 0

    0.0   0.1   0.2   0.3
    1.0   NaN   1.2   1.3
    2.0   2.1   NaN   2.3
    3.0   3.1   3.2   3.3

    df.fillna(method='ffill', axis=0)  → third row becomes  2.0  2.1  1.2  2.3
    df.fillna(method='ffill', axis=1)  → third row becomes  2.0  2.1  2.1  2.3

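The effect of the axis can be checked with the 4x4 grid from the slide. This is a sketch assuming pandas and NumPy are installed; it uses .ffill(axis=…), which performs the same forward fill as fillna(method='ffill', axis=…).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [0.0, 0.1, 0.2, 0.3],
    [1.0, np.nan, 1.2, 1.3],
    [2.0, 2.1, np.nan, 2.3],
    [3.0, 3.1, 3.2, 3.3],
])

down = df.ffill(axis=0)     # fill from the row above (same column)
across = df.ffill(axis=1)   # fill from the column to the left (same row)

print(down)
print(across)
```
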
Interpolating Missing Data

→ more advanced techniques
    → linear interpolation
    → splines …
→ beyond scope of this course
→ but will look at simple linear interpolation in code

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html

Dropping Data

→ already saw this for Series objects
→ DataFrame is 2-D

    0.0   0.1   0.2   0.3
    1.0   NaN   1.2   1.3
    2.0   2.1   NaN   2.3
    3.0   3.1   3.2   3.3

    → do we delete rows with missing values?
    → or do we delete columns with missing values?
→ need to specify an axis

    df.dropna(axis=0)
    df.dropna(axis=1)

    → axis defaults to 0 if we don't specify it

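Both choices can be sketched on the same grid, assuming pandas and NumPy are installed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [0.0, 0.1, 0.2, 0.3],
    [1.0, np.nan, 1.2, 1.3],
    [2.0, 2.1, np.nan, 2.3],
    [3.0, 3.1, 3.2, 3.3],
])

rows_dropped = df.dropna(axis=0)   # removes every row that contains a NaN
cols_dropped = df.dropna(axis=1)   # removes every column that contains a NaN

print(rows_dropped)
print(cols_dropped)
```

In both cases half of the data is discarded for just two missing cells, which is why filling or interpolating is sometimes preferable.
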
Coding

Loading Data

→ Pandas has built-in functions for loading many types of data

in this lecture we'll look at
    → CSV files
    → Excel files

→ many other data sources are supported (SQL, JSON, SAS, SPSS, etc)

https://pandas.pydata.org/pandas-docs/stable/reference/io.html

Loading a CSV File

→ pd.read_csv(<file_name>)
→ has many optional arguments
    → sep and delimiter (just like Python's csv.reader)
    → header     row number to use as column labels, otherwise infers them
    → usecols    a list of positional indexes indicating which columns to keep
    → names      renames the columns
    → index_col  specifies (by name or index) which column to use as the row index

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

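A sketch of a few of these arguments. To keep it self-contained, the CSV text is held in a string and wrapped in io.StringIO instead of being read from a file on disk; the semicolon separator and the column names are assumptions made for the example.

```python
import io
import pandas as pd

csv_text = """borough;county;population
The Bronx;Bronx;1418207
Brooklyn;Kings;2559903
"""

df = pd.read_csv(
    io.StringIO(csv_text),   # a file name would normally go here
    sep=';',                 # field separator
    header=0,                # first row holds the column labels
    usecols=[0, 2],          # keep only the first and third columns
    index_col='borough',     # use this column as the row index
)

print(df)
```
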
Loading an Excel File

→ Pandas relies on external 3rd party libraries to read Excel files
    → many exist, such as xlrd, openpyxl
    → need to pip install the library in your virtual env
    → already done if you followed install at beginning of course

→ pd.read_excel('file_name')
    → sheet_name  the sheet name, or index (zero based) to load
    → header
    → usecols
    → names
    → index_col
    and more…

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html

Coding

Basic Data Analysis

→ basic facts about a loaded data set

.info()      → column names, types, not-null counts
.describe()  → mean, min, max, quartiles, std dev
    → by default only includes numerical columns
    → include='all'
        → categorical columns
            → # unique values
            → most frequent value + frequency

→ output is "print" output

→ equivalent methods to obtain the same data

.nunique()       → # of unique values
.unique()        → array of unique values
.value_counts()  → Series of values and their frequency
.count()
.mean()
.std()
.quantile()

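A sketch of the summary and its individual building blocks, assuming pandas is installed. The tiny data set (cities and temperatures) is made up purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'city': ['NYC', 'NYC', 'LA', 'NYC'],
    'temp': [10.0, 20.0, 30.0, 40.0],
})

numeric_summary = df.describe()                # numerical columns only
full_summary = df.describe(include='all')      # adds unique / top / freq rows

n_cities = df['city'].nunique()                # number of unique values
cities = df['city'].unique()                   # array of unique values
counts = df['city'].value_counts()             # frequency of each value
n_temps = df['temp'].count()                   # non-null count
mean_temp = df['temp'].mean()
median_temp = df['temp'].quantile(0.5)

print(full_summary)
print(counts)
```
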
Coding

Sorting and Filtering

Filtering

→ boolean masking
    → works similarly to NumPy and Series masking
→ create a boolean masking array
    → use explicit or implicit index

    mask = df['col'] >= 0
    mask = df.iloc[:, 2] >= 0

→ apply mask to data frame

    df[mask]

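A minimal masking sketch, assuming pandas is installed. The frame and its column names are hypothetical; here 'col' sits at position 1, so the positional version of the mask uses iloc[:, 1].

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['a', 'b', 'c'],
    'col': [-1, 0, 5],
})

mask = df['col'] >= 0            # built through the explicit column label
same_mask = df.iloc[:, 1] >= 0   # built through the column position

filtered = df[mask]              # keeps only the rows where the mask is True

print(mask.tolist())
print(filtered)
```
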
Sorting

→ sort rows based on the row index labels

    df.sort_index()

→ sort rows based on values in a column

    df.sort_values('col label')

→ similarly to Python's sorted() function, these support a key argument

Reviewing sorted(key=…)

    l = ['Z', 'a', 'b']
    sorted(l, key=lambda x: x.casefold())

→ key is a function that transforms each element of l, one by one

    l         = ['Z', 'a', 'b']
    sort_keys = ['z', 'a', 'b']

→ sorting is then based on sort_keys
→ sort by an associated series of keys

The key Argument for DataFrames

→ sort by an associated series of keys
→ instead of using a function that generates the keys one by one
    → use a vectorized function to generate the sequence of sort keys all at once
→ key function receives a Series as its argument
    → should return a Series object with same shape

    s = Series([1, -1, 2, -2])
    key = np.abs(s)  → Series([1, 1, 2, 2])

Sorting by Index

    a   1   2   3
    B   4   5   6
    c   7   8   9

    def sort_func(ind):
        return ind.str.casefold()

    sort_func(df.index)  → ['a', 'b', 'c']

    df.sort_index(key=sort_func)

or

    df.sort_index(key=lambda ind: ind.str.casefold())

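The effect of the key can be seen by comparing it with the default ordering, in which upper case letters sort before lower case ones. A sketch assuming pandas is installed:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=['a', 'B', 'c'],
)

# Default: plain string comparison, so 'B' comes before 'a'
default_order = df.sort_index()

# With a key: the whole index is case-folded at once before comparing
folded_order = df.sort_index(key=lambda ind: ind.str.casefold())

print(list(default_order.index))
print(list(folded_order.index))
```

The key only affects the comparison; the original labels (including the upper case 'B') are kept in the result.
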
Sorting By Values

→ same as sorting by index
→ uses some specified column instead of index

        c1   c2   c3
    a    1   -2    3
    B   -4    5    6
    c    7    8   -9

    df.sort_values('c1')  → sorts based on values in c1

        c1   c2   c3
    B   -4    5    6
    a    1   -2    3
    c    7    8   -9

    (index is preserved)

Sorting by Values with a key

→ key function receives the sort by column (Series) as its argument

        c1   c2   c3
    a    1   -2    3
    B  -40    5    6
    c    7    8   -9

    df.sort_values('c1', key=lambda col: np.abs(col))

    → key function receives column c1 as its argument
    → returns a new Series  → 1, 40, 7

        c1   c2   c3
    a    1   -2    3
    c    7    8   -9
    B  -40    5    6

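The two orderings for this table can be compared directly. A sketch assuming pandas and NumPy are installed:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'c1': [1, -40, 7], 'c2': [-2, 5, 8], 'c3': [3, 6, -9]},
    index=['a', 'B', 'c'],
)

plain = df.sort_values('c1')                                  # by raw values
by_abs = df.sort_values('c1', key=lambda col: np.abs(col))    # by absolute values

print(plain)
print(by_abs)
```
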


Sorting on Multiple Columns

→ can specify a multi-level sort based on multiple columns

    c1   c2   c3
    a    -1   100
    z     2   200
    a    -3   300
    a    10   400
    z    -1   500

    df.sort_values('c1')  → stable sort based on c1 column

    c1   c2   c3
    a    -1   100
    a    -3   300
    a    10   400
    z     2   200
    z    -1   500

    df.sort_values(['c1', 'c2'])  → sorts on c1, then c2

    c1   c2   c3
    a    -3   300
    a    -1   100
    a    10   400
    z    -1   500
    z     2   200

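Both sorts can be reproduced with the table above. A sketch assuming pandas is installed; kind='stable' is passed explicitly for the single-column sort so that ties are guaranteed to keep their original order, since the default sorting algorithm does not promise that.

```python
import pandas as pd

df = pd.DataFrame({
    'c1': ['a', 'z', 'a', 'a', 'z'],
    'c2': [-1, 2, -3, 10, -1],
    'c3': [100, 200, 300, 400, 500],
})

one_level = df.sort_values('c1', kind='stable')   # ties on c1 keep input order
two_level = df.sort_values(['c1', 'c2'])          # ties on c1 broken by c2

print(one_level)
print(two_level)
```
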
Coding

Manipulating Data

→ vectorized operations similar to NumPy arrays

    .count   .sum   .prod
    .min   .max   .mean   .std

    → works across all elements of DataFrame
    → or along a specified axis

→ regular arithmetic operators
→ NumPy universal functions

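A short sketch of these operations, assuming pandas and NumPy are installed. For the total over all elements the sketch goes through the underlying NumPy array, which behaves the same across pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])

col_sums = df.sum()              # along axis 0 (default): one value per column
row_sums = df.sum(axis=1)        # along axis 1: one value per row
total = df.to_numpy().sum()      # across all elements

doubled = df * 2                 # regular arithmetic operator, element-wise
roots = np.sqrt(df)              # NumPy universal function, returns a DataFrame

print(col_sums.tolist(), row_sums.tolist(), total)
print(doubled)
print(roots)
```
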
→ .transpose()

        a   b   c              A   B   C
    A   0   1   2          a   0   3   6
    B   3   4   5    →     b   1   4   7
    C   6   7   8          c   2   5   8

→ to_numeric()
    → int, float
    → entries that cannot be converted result in an exception
    → can override this behavior using the errors argument

        errors = 'coerce'

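A sketch of both behaviors of to_numeric(), assuming pandas is installed. The sample strings are made up.

```python
import pandas as pd

raw = pd.Series(['1', '2.5', 'abc'])

# Default behavior: an entry that cannot be converted raises an exception
try:
    pd.to_numeric(raw)
    raised = False
except ValueError:
    raised = True

# errors='coerce': unconvertible entries become NaN instead
coerced = pd.to_numeric(raw, errors='coerce')

print(raised)
print(coerced)
```
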
Concatenating DataFrames

→ concatenate along an axis

    pd.concat([df1, df2, …], axis=0|1)

    axis = 1  → horizontally
    axis = 0  → vertically

→ uses row or column index to "align" concatenated rows/columns

    axis = 1

        a   b   c            d   e   f
    r1  1   2   3        r1  1   2   3
    r2  4   5   6        r2  4   5   6
    r3  7   8   9        r4  7   8   9

    →
        a    b    c    d    e    f
    r1  1    2    3    1    2    3
    r2  4    5    6    4    5    6
    r3  7    8    9    nan  nan  nan
    r4  nan  nan  nan  7    8    9

→ this is called an outer join



Concatenating DataFrames

→ outer joins are the default
→ in an outer join "missing" data in the join are replaced with NaN

        a   b   c            d   e   f
    r1  1   2   3        r1  1   2   3
    r2  4   5   6        r2  4   5   6
    r3  7   8   9        r4  7   8   9

    →
        a    b    c    d    e    f
    r1  1    2    3    1    2    3
    r2  4    5    6    4    5    6
    r3  7    8    9    nan  nan  nan
    r4  nan  nan  nan  7    8    9

→ in an inner join, missing rows/columns are dropped entirely

    pd.concat([df1, df2], axis=1, join='inner')

    →
        a   b   c   d   e   f
    r1  1   2   3   1   2   3
    r2  4   5   6   4   5   6

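Both joins can be reproduced with the two frames from the slides. A sketch assuming pandas is installed:

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=['r1', 'r2', 'r3'], columns=['a', 'b', 'c'],
)
df2 = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=['r1', 'r2', 'r4'], columns=['d', 'e', 'f'],
)

# Outer join (default): every row label is kept, gaps are filled with NaN
outer = pd.concat([df1, df2], axis=1)

# Inner join: only the row labels present in both frames are kept
inner = pd.concat([df1, df2], axis=1, join='inner')

print(outer)
print(inner)
```
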
Coding

Matplotlib

29

→ Matplotlib is a popular graphing library
    https://matplotlib.org/
→ integrates well with Jupyter Notebooks
→ there are many others available too
    → geoplotlib  maps, geographical data
    → ggplot      little simpler than matplotlib, not as customizable (based on matplotlib)
    → plotly      interactive plots/web, contour plots, 3D, …
    and more…

→ numerous extension packages to Matplotlib
    → financial
    → maps and map projections
    → specialty axes (like broken axes)
    → electronic circuits
    → Venn diagrams
    → density maps
    → statistical maps
    → ML visualizations
    and many more…

https://matplotlib.org/3.1.0/thirdpartypackages/index.html

→ we'll look at how to create and theme various Matplotlib charts
    → single plots
    → overlaid plots
    → grids of plots
→ we'll look at OHLC plots using mplfinance extension
    https://github.com/matplotlib/mplfinance

→ pip install matplotlib
  pip install mplfinance

Matplotlib Basics

Imports

→ two sections of the matplotlib library we will use often

    import matplotlib as mpl
    import matplotlib.pyplot as plt

Styles

mpl.style.available
    → returns a list of the various styles available on your system
    → a list of strings

mpl.style.use('…')
    → sets your notebook to use a particular style
    → use the exact string as listed above

→ see a preview of various styles
https://matplotlib.org/3.2.1/gallery/style_sheets/style_sheets_reference.html

Anatomy of Figures

[Figure: annotated example chart labelling the parts of a figure: figure, Title, legend ("blue line", "red line"), Axes, line plots, scatter plot markers, x-axis label, y-axis label, major ticks with major tick labels (0 to 5), minor ticks with minor tick labels (0.25 to 4.75)]



Creating a Figure and Axes

→ simplest is to use subplots() function in pyplot module

    import matplotlib.pyplot as plt
    plt.subplots()

    → creates a new figure and one Axes object
    → returns it as a tuple
    → and displays the figure in Jupyter

    fig, ax = plt.subplots()

    → blank chart
    → need to specify something to plot

Plotting Data

→ we add a plot to an Axes

    ax.plot(x_coords, y_coords, label='…')

    → this adds a (line) plot to the Axes object, with specified plot name (used in legend)
    → x and y coordinates can be lists, NumPy arrays, Pandas columns, …
    → can keep adding more plots to same Axes

→ we have to display the figure to see the result
    → typically create figure and plots in a single Jupyter cell

Additional Axes Settings

ax.set_xlabel('…')  → sets the x-axis label
ax.set_ylabel('…')  → sets the y-axis label
ax.set_title('…')   → sets the title
ax.legend()         → creates and adds a legend

→ note how all these are applied to the Axes object
→ later we'll see how to add multiple Axes to the same figure

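The steps from the last three slides can be put together in one cell. This is a sketch assuming Matplotlib is installed; the data and label texts are made up, and the 'Agg' backend line is only there so the sketch also runs outside Jupyter (in a notebook it can be left out).

```python
import matplotlib
matplotlib.use('Agg')   # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]

fig, ax = plt.subplots()                             # one figure, one Axes
ax.plot(x, [v ** 2 for v in x], label='squares')     # first line plot
ax.plot(x, [2 * v for v in x], label='doubles')      # second plot, same Axes

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('Two line plots on one Axes')
legend = ax.legend()                                 # uses the label= names
```
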
Coding

Multi Plots

Multiple Plots on Same Axes

→ saw this in previous lecture
→ just keep adding plots to the same Axes object

    ax.plot(…)
    ax.plot(…)

Multiple Axes on Same Figure

→ we can also chart multiple Axes on the same figure
    → grid layout
        → number of columns
        → number of rows

    plt.subplots(n_rows, m_columns)

    → creates the figure
    → creates n * m Axes objects laid out as specified
    → returns figure and collection of Axes as a 2-value tuple
        → first element is the figure
        → second element is a NumPy ndarray with all the Axes

Setting Figure Size

→ technically size is defined in width and height in inches
    → what that shows up as on your screen will depend on your resolution (dpi)
    → default is 6.4 (w) x 4.8 (h)

→ can specify for a single figure

    plt.subplots(figsize=(width, height))

→ or specified as a global change

    plt.rcParams['figure.figsize'] = [width, height]

→ play around with width/height until you find a setting you like

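A sketch of a 2 x 3 grid with an explicit figure size, assuming Matplotlib is installed. The size and the plotted values are arbitrary, and the 'Agg' backend line is only needed outside Jupyter.

```python
import matplotlib
matplotlib.use('Agg')   # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

# 2 rows x 3 columns of Axes on one figure, 9 x 4 inches
fig, axes = plt.subplots(2, 3, figsize=(9, 4))

# axes is a 2-D NumPy array: index it with [row, column]
axes[0, 0].plot([0, 1, 2], [0, 1, 4])
axes[0, 0].set_title('top left')
axes[1, 2].plot([0, 1, 2], [4, 1, 0])
axes[1, 2].set_title('bottom right')
```
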
Coding

More Plot Types

Plot Styles

→ plot(…)     a line plot
→ bar(…)      vertical bar plot
→ scatter(…)  scatter plot
→ plenty more…

https://matplotlib.org/3.1.1/gallery/index.html

Adding Vertical/Horizontal Lines to Axes

→ sometimes useful to add vertical/horizontal lines to a chart
    → display info such as mean, median, other important "values"

    ax.axhline(y=…, xmin=0, xmax=1)
    ax.axvline(x=…, ymin=0, ymax=1)

    xmin/xmax  → values between 0 and 1
               → 0 indicates left edge, 1 indicates right edge
    ymin/ymax  → values between 0 and 1
               → 0 indicates bottom edge, 1 indicates top edge



Histograms

→ can use NumPy to generate histogram data, and then use Matplotlib
→ but also built-in to Matplotlib directly

    ax.hist(data, bins=…)

    → data is the data for which we want to generate the histogram
    → bins specifies the number of bins we want to use

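The plot types and reference lines from this section can be sketched on three side-by-side Axes. This assumes Matplotlib and NumPy are installed; the random data, seed and bar labels are made up for illustration.

```python
import matplotlib
matplotlib.use('Agg')   # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=200)

fig, (ax_bar, ax_scatter, ax_hist) = plt.subplots(1, 3, figsize=(12, 4))

ax_bar.bar(['a', 'b', 'c'], [3, 1, 2])               # vertical bar plot

ax_scatter.scatter(data[:50], data[50:100])          # scatter plot
ax_scatter.axhline(y=0, xmin=0, xmax=1)              # full-width horizontal line
ax_scatter.axvline(x=0, ymin=0.25, ymax=0.75)        # middle half of the height

counts, edges, _ = ax_hist.hist(data, bins=10)       # histogram with 10 bins
ax_hist.axvline(x=data.mean())                       # mark the mean
```

Note that xmin/xmax and ymin/ymax are fractions of the Axes width or height, not data coordinates.
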
Coding

Charting with mplfinance

mplfinance

https://github.com/matplotlib/mplfinance

→ add-on to Matplotlib that provides extra plot types

    pip install mplfinance

→ import it in Jupyter

    import mplfinance as mpf

→ we'll use it for candlestick/OHLC charts

Plotting OHLC Charts

→ simplest is to arrange a Pandas data frame as follows:
    → index is the datetime for each row
    → five data columns in specific order
        → Open, High, Low, Close, Volume

    mpf.plot(data_frame)

Additional Plot Arguments

→ type
    used to specify chart type (e.g. 'ohlc', 'candle')
→ mav
    used to superimpose one or more moving averages
    → single value for single mav, tuple of values for multiple
→ volume
    True to display Volume bar chart (defaults to False)
→ show_nontrading
    True to show non-trading days in chart (gaps), defaults to False

Superimposing Plots

→ create subplots
    → plots = mpf.make_addplot(…)
→ add them to main plot when creating it
    → mpf.plot(…, addplot=plots)

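The pieces from the last few slides can be combined as follows. This is a sketch under stated assumptions: the prices and volumes are invented sample numbers, the plot_ohlc helper and the midpoint overlay are hypothetical names introduced here, and mplfinance is imported inside the helper so that the data preparation part runs even where the package is not installed.

```python
import pandas as pd

# Index: one datetime per row; columns in Open, High, Low, Close, Volume order
dates = pd.date_range('2024-01-01', periods=5, freq='D')
ohlc = pd.DataFrame(
    {
        'Open':   [10.0, 10.5, 10.2, 10.8, 11.0],
        'High':   [10.8, 10.9, 10.9, 11.2, 11.5],
        'Low':    [9.8, 10.1, 10.0, 10.6, 10.9],
        'Close':  [10.5, 10.2, 10.8, 11.0, 11.4],
        'Volume': [1000, 1200, 900, 1500, 1100],
    },
    index=dates,
)


def plot_ohlc(data_frame):
    """Candlestick chart with moving averages, volume and one extra line."""
    import mplfinance as mpf   # requires: pip install mplfinance

    # Extra series to superimpose: the midpoint of each day's range
    midpoint = (data_frame['High'] + data_frame['Low']) / 2
    extra = mpf.make_addplot(midpoint)

    mpf.plot(
        data_frame,
        type='candle',           # or 'ohlc'
        mav=(2, 3),              # two moving averages
        volume=True,             # add the volume bar chart
        show_nontrading=False,   # no gaps for non-trading days
        addplot=extra,
    )
```

In a Jupyter cell, calling plot_ohlc(ohlc) would draw the chart.
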
Coding

Conclusion

30

Congratulations!!!

What we covered

→ the Python language
→ basic and advanced data types
→ some of the Python standard library
→ some popular 3rd party libraries

    requests    → Web and API requests, JSON
    NumPy       → efficient array computations
    Pandas      → loading and manipulating data
    Matplotlib  → charting data

Important References

www.python.org
numpy.org
pandas.pydata.org
matplotlib.org
requests.readthedocs.io/en/master/

Practice!

→ to learn programming: practice, practice, practice
    → yes, it can be hard
    → yes, you'll make mistakes
    → even seasoned devs struggle and make mistakes – often!
→ read (and understand) other people's code
    → if you work with other devs, review each other's code
    → or find a developer friend who can do that
    → look at open source projects (GitHub)
→ be patient – don't give up
    → you won't become an expert in 3 weeks
    → as you write more code, you'll become more and more proficient

Practice Sites

→ there are many sites where you can practice with coding challenges

    edabit.com
    coderbyte.com
    www.codewars.com
    www.hackerrank.com
    and many more…

Additional Resources

stackoverflow.com  → if you have a coding question
    → you probably weren't the first with that question
    → you will probably find an answer, at least close
    → if not, you can post your question
    → just browse questions/answers – incredibly informative

Python Cookbook, by Beazley and Jones (O'Reilly Press)
Fluent Python, by Ramalho (O'Reilly Press)

YouTube  → experts such as Hettinger, Beazley, Martelli, PyCon talks
Twitter  → Raymond Hettinger (@raymondh)

Coding

almost kidding!

→ try executing these (separately) in a Jupyter cell

    import this
    import __hello__
    import antigravity

→ if you're a dev who would rather use braces {} for code blocks

    from __future__ import braces

I hope you enjoyed this course as much as I enjoyed creating it

Thank You!!

