You are on page 1of 350

Python for Econometrics

Lecturer: Fabian H. C. Raters

Institute: Econometrics, University of Goettingen

Version: October 31, 2020

© 2020 PyEcon.org. All rights reserved. Python is a trademark of the PSF.


Learning Python for econometrics 2
Essential
concepts
Getting started Welcome to this course and to the world of Python!
Procedural
programming
Object-orientation
Learning objectives of this course:
Numerical
programming
NumPy package Python: The course is about Python programming.
Array basics
Linear algebra
for : You will learn tools and methods.
Data formats and
handling Econometrics:
Pandas package
Series
Statistics: Numerical programming in Python.
DataFrame applied to: We will use it on examples.
Import/Export data
Economics: In an economic context.
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Learning Python for econometrics 3
Essential
concepts
Getting started Knowledge after completing this course:
Procedural
programming
Object-orientation
You have acquired a basic understanding of programming in general
Numerical
programming
with Python and a special knowledge of working with standard
NumPy package numerical packages.
Array basics
Linear algebra
You are able to study Python in depth and absorb new knowledge
Data formats and
handling for your scientific work with Python.
Pandas package
Series
You know the capabilities and further possibilities to use Python
DataFrame
in econometrics.
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Learning Python for econometrics 4
Essential
concepts
Getting started What you should not expect from this course:
Procedural
programming
Object-orientation
A guide how to install or maintain an application.
Numerical
programming An introduction to programming for beginners.
NumPy package
Array basics An introduction to professional development tools.
Linear algebra

Data formats and Non-scientific, general purpose programming (beyond the language
handling
Pandas package
essentials).
Series
DataFrame
Few content and less effort...
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Course organisation 5
Essential
concepts
Getting started This course can be seen as an applied lecture:
Procedural
programming
Object-orientation Lecture:
Numerical We try to explain the partly theoretical knowledge on Python by sim-
programming
NumPy package ple, easy to understand examples. You can learn the programming
Array basics
Linear algebra
language’s subtleties by reading literature.
Data formats and Exercises:
handling
Pandas package Digital work sheets in the form of Jupyter notebooks with applied
Series
DataFrame
tasks are available for each chapter. For all exercises there are sample
Import/Export data solutions available in separate notebooks.
Visual
illustrations Self-tests:
Matplotlib package
Figures and subplots
At the end of each of the five chapters there are typical exam questions.
Plot types and styles
Pandas layers
Written exam:
Applications There will be a final exam. This will be a pure multiple choice exam:
Time series
60 questions, 90 minutes.
Moving window
Financial applications
Optimization
After the successful participation in the exam you will receive 6 ECTS.

© 2020 PyEcon.org
Literature 6
Essential
concepts
Getting started The programming language Python is already established and very well
Procedural
programming in trend for numerical applications. Some keywords:
Object-orientation

Numerical
programming
Data science,
NumPy package
Array basics
Data wrangling,
Linear algebra
Machine learning,
Data formats and
handling
Pandas package
Numerical statistics,
Series
DataFrame
...
Import/Export data

Visual Recommended literature while following this course:


illustrations
Matplotlib package Learning Python, 5th Edition by Mark Lutz,
Figures and subplots
Plot types and styles Python Crash Course, 2nd Edition by Eric Matthes,
Pandas layers

Applications Python Data Science Handbook by Jake VanderPlas,


Time series
Moving window Python for Data Analysis, 2nd Edition by Wes McKinney,
Financial applications
Optimization Python for Finance, 2nd Edition by Yves Hilpisch.

© 2020 PyEcon.org
Software: Python 3 7
Essential
concepts
Getting started We are using Python 3. There was a big revision in the migration
Procedural
programming from Python 2 to version 3 and the new version is no longer backwards
Object-orientation
compatible to the old version.
Numerical
programming
NumPy package Python 3 running [command line]
Array basics
Linear algebra python --version
Data formats and
handling
Pandas package ## Python 3.8.5
Series
DataFrame
Import/Export data The normal execution mode is that the Python interpreter processes
Visual
illustrations
the instructions in the background – in other numeric programming
Matplotlib package languages such as R this is known as batch mode. It executes program
Figures and subplots
Plot types and styles
code that is usually located in a source code file.
Pandas layers
The interpreter can also be started in an interactive mode. It is used
Applications
Time series for testing and analytic purposes in order to obtain fast results when
Moving window
Financial applications
performing simple applications.
Optimization

© 2020 PyEcon.org
Software: IDEs 8
Essential
concepts
Getting started For everyday work with Python it would be extremely tedious to make
Procedural
programming all edits in interactive mode.
Object-orientation

Numerical
There are a number of excellent integrated development environments
programming
NumPy package
(IDEs) for Python, with three being emphasized here:
Array basics
Linear algebra
Jupyter (and IPython)
Data formats and
handling Spyder (scientific IDE)
Pandas package
Series PyCharm (by IntelliJ)
DataFrame
Import/Export data

Visual
Of course, you can also use a simple text editor. However, you would
illustrations
Matplotlib package
probably miss the comfort of an IDE.
Figures and subplots
Plot types and styles
Installing, adding and maintaining Python is not trivial at the beginning.
Pandas layers Therefore, as a beginner, you are well advised to download and install
Applications the Python distribution Anaconda. Bonus: Many standard packages
Time series
Moving window are supplied directly or you can post-install them conveniently.
Financial applications
Optimization

© 2020 PyEcon.org
Following this course 9
Essential
concepts
Getting started In this course – in a numeric and analytic context – we use only Jupyter
Procedural
programming with the IPython kernel.
Object-orientation

Numerical
That is why we have combined
programming
NumPy package
Array basics
1 all the code from the slides, and
Linear algebra
2 all the exercises and solutions
Data formats and
handling
Pandas package
into interactive Jupyter notebooks that you can use online without
Series
DataFrame having to install software locally on your computer. The GWDG has
Import/Export data
set up a cloud-based Jupyter-Hub for you.
Visual
illustrations
Matplotlib package
You can access the working environment with your university credentials
Figures and subplots at
Plot types and styles
Pandas layers https://jupyter-cloud.gwdg.de/
Applications
Time series
create a profile and get started right away – even using your smart
Moving window devices. However, so far you are still asked to upload the course
Financial applications
Optimization notebooks by yourself or rewrite the code from scratch.

© 2020 PyEcon.org
Notebook workflow 10
Essential
concepts
Getting started A Jupyter notebook is divided into individual, vertically arranged cells,
Procedural
programming which can be executed separately:
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization The notebook approach is not novel and comes from the field of
computer algebra software.
© 2020 PyEcon.org
Notebook workflow 11
Essential
concepts
Getting started Actually, an interactive Python interpreter called IPython is started “in
Procedural
programming the core”.
Object-orientation

Numerical IPython running [command line]


programming
NumPy package ipython --version
Array basics
Linear algebra

Data formats and ## 7.19.0


handling
Pandas package
Series Roughly speaking, this is a greatly enhanced version of the Python
DataFrame
Import/Export data
3 interpreter, which has numerous, convenient advantages over the
Visual “normal” interpreter in interactive mode, such as, e.g.,
illustrations
Matplotlib package printing of return values,
Figures and subplots
Plot types and styles color highlighting, and
Pandas layers

Applications
magic commands.
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Following this course 12
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra Finally, we wish you a lot of fun and success with and in this course!
Data formats and
handling
Pandas package Practice makes perfect!
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots Contribution and credits:
Plot types and styles
Pandas layers
Fabian H. C. Raters
Applications
Time series Eike Manßen
Moving window
Financial applications
GWDG for the Jupyter-Hub
Optimization

© 2020 PyEcon.org
Table of contents 13
Essential
concepts
Getting started
Procedural
programming
1 Essential concepts 4 Visual illustrations
Object-orientation
1.1 Getting started 4.1 Matplotlib package
Numerical
programming 1.2 Procedural programming 4.2 Figures and subplots
NumPy package
Array basics
1.3 Object-orientation 4.3 Plot types and styles
Linear algebra
2 Numerical programming 4.4 Pandas layers
Data formats and
handling 2.1 NumPy package 5 Applications
Pandas package
Series
2.2 Array basics 5.1 Time series
DataFrame 2.3 Linear algebra 5.2 Moving window
Import/Export data

Visual 3 Data formats and handling 5.3 Financial applications


illustrations
3.1 Pandas package 5.4 Optimization
Matplotlib package
Figures and subplots
Plot types and styles
3.2 Series
Pandas layers 3.3 DataFrame
Applications 3.4 Import/Export data
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Chapter 1 14
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Essential concepts
Numerical
programming
NumPy package
Array basics
1.1 Getting started
Linear algebra

Data formats and


1.2 Procedural programming
handling
Pandas package 1.3 Object-orientation
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 1.1 15
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Essential concepts
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Getting started
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Motivation for learning Python 16
Essential
concepts
Getting started Python can be described as
Procedural
programming
Object-orientation
a dynamic, strongly typed, multi-paradigm and object-oriented
Numerical
programming
programming language,
NumPy package
Array basics
for versatile, powerful, elegant and clear programming,
Linear algebra
with a general, high-level, multi-platform application scope,
Data formats and
handling
Pandas package
which is being used very successfully in the data science sector
Series and very much in trend.
DataFrame
Import/Export data

Visual
Moreover, Python is relatively easy to learn and its successful language
illustrations
Matplotlib package
design supports novices to professional developers. Much of Python’s
Figures and subplots success is due to a high degree of standardization and a huge community
Plot types and styles
Pandas layers
that elaborates and collectively recognizes conventions and paradigms.
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
A short history of time 17
Essential
concepts
Getting started ... of the Python era:
Procedural
programming
Object-orientation The language was originally developed in 1991 by Guido van Rossum.
Numerical Its name was based on Monty Python’s Flying Circus. Its main identifi-
programming
NumPy package cation feature is the novel markup of code blocks – by indentation:
Array basics
Linear algebra
Indentation example
Data formats and
handling password = input("I am your bank. Password please: ")
Pandas package
Series ## I am your bank. Password please: sparkasse
DataFrame
Import/Export data if password == "sparkasse":
Visual print("You successfully logged in!")
illustrations else:
Matplotlib package
print("Fail. Will call the police!")
Figures and subplots
Plot types and styles
Pandas layers ## You successfully logged in!
Applications
Time series
Moving window
This increases the readability of code and should at the same time
Financial applications encourage the programmer in programming neatly. Since the source
Optimization
code can be written more compactly with Python, an increased efficiency
in daily work can be expected.
© 2020 PyEcon.org
A short history of time 18
Essential
concepts
Getting started Overview of the Python development by versions and dates:
Procedural
programming
Object-orientation

Numerical
programming
1990 1995 2000 2005 2010 2015 2020 2025
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
Python’s birthday Python 2.0 Python 3.0
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Python 2.7 lives forever Python 3.9
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Python 3.6 Python 2.7 will die
Financial applications
Optimization

© 2020 PyEcon.org
In comparison 19
Essential
concepts
Getting started Comparing the way Python works with common programming languages,
Procedural
programming we briefly discuss a selection of popular competitors:
Object-orientation

Numerical C/C++:
programming
NumPy package CPython is interpreted, not compiled.
Array basics
Linear algebra C/C++ are strongly static, complex languages.
Data formats and
handling Java:
Pandas package
Series CPython is not compiled just-in-time.
DataFrame
Import/Export data Java has a C-type syntax.
Visual
illustrations
MATLAB
Matplotlib package
Figures and subplots
In Python you primarily follow a scalar way of thinking, while in
Plot types and styles MATLAB you write matrix-based programs.
Pandas layers

Applications In the numerical context, the matrix view and syntax are very
Time series
similar to those of MATLAB.
Moving window
Financial applications
MATLAB is partially compiled just-in-time.
Optimization

Where CPython is the reference implementation – the “Original Python”,


© 2020 PyEcon.org
which is implemented in C itself.
In comparison 20
Essential
concepts
Getting started R
Procedural
programming
Object-orientation
In Python you primarily follow a scalar way of thinking, while in R
Numerical
you write vector-based programs.
programming
NumPy package R has a C-type syntax including additions to novel language con-
Array basics
Linear algebra
cepts.
Data formats and Stata
handling
Pandas package Any comparison would inadequately describe the differences.
Series
DataFrame
Import/Export data Reference semantics
Visual
illustrations An extremely important difference between the first two languages,
Matplotlib package
Figures and subplots
C/C++ and Java, as well as Python itself, and the last three languages
Plot types and styles is that they follow a call-by-reference semantic, while MATLAB, R and
Pandas layers

Applications
Stata are call-by-copy.
Time series
Moving window Further specific differences and similarities to MATLAB and R will be
Financial applications
Optimization addressed in other parts of this course.

© 2020 PyEcon.org
Versatility – diversity 21
Essential
concepts
Getting started Python has become extremely popular:
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/
© 2020 PyEcon.org
Versatility – diversity 22
Essential
concepts
Getting started So, you’re on the right track – because who wants to bet on the wrong
Procedural
programming hoRse?
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/
© 2020 PyEcon.org
Versatility – diversity 23
Essential
concepts
Getting started Areas in which Python is used with great success:
Procedural
programming
Object-orientation Scripts,
Numerical Console applications,
programming
NumPy package GUI applications,
Array basics
Linear algebra
Game development,
Data formats and Website development, and
handling
Pandas package
Numerical programming.
Series
DataFrame Places where Python is used:
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Yet another outline 24
Essential
concepts
Getting started In this course we will successively gain the following insights:
Procedural
programming
Object-orientation

Numerical
programming
1 General basics of the language.
NumPy package
Array basics
Linear algebra
2 Numerical programming and handling of data sets.
Data formats and
handling 3 Application to economic and analytical questions.
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 1.2 25
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Essential concepts
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Procedural programming
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
The first program 26
Essential
concepts
Getting started Programs can be implemented very quickly – this is a pretty minimal
Procedural
programming example. You can write this command to a text file of your choice and
Object-orientation
run it directly on your system:
Numerical
programming
NumPy package Hello there
Array basics
Linear algebra
print("Hello there!")
Data formats and
handling ## Hello there!
Pandas package
Series

Only one function print() (shown here as a keyword),


DataFrame
Import/Export data

Visual
illustrations Function displays argument (a string) on screen,
Matplotlib package
Figures and subplots
Arguments are passed to the function in parentheses,
A string must be wrapped in " " or ’ ’,
Plot types and styles
Pandas layers

Applications
No semicolon at the end.
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
User input 27
Essential
concepts
Getting started Let’s add a user input to the program:
Procedural
programming
Object-orientation Hello you
name = input("Please enter your name: ")
Numerical
programming
NumPy package
Array basics
## Please enter your name: Angela Merkel
Linear algebra
print("Hello " + name + "!")
Data formats and
handling
Pandas package
## Hello Angela Merkel!
Series
DataFrame
Import/Export data

Visual
The function input() is used for interactive text input,
You can use the equal sign = to assign variables (here: name),
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Strings can be joined by the (overloaded) Operator +.
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Determining weekdays 28
Essential
concepts
Getting started We are now trying to find out on which weekday a person was born
Procedural
programming (Merkel’s birthday is 17-07-1954):
Object-orientation

Numerical
programming
Weekday of birth
NumPy package
from datetime import datetime
Array basics
Linear algebra answer = input("Your birthday (DD-MM-YYYY): ")
Data formats and
handling ## Your birthday (DD-MM-YYYY): 17-07-1954
Pandas package
Series birthday = datetime.strptime(answer, "%d-%m-%Y")
DataFrame print("Your birthday was on a " + birthday.strftime("%A") + "!")
Import/Export data

Visual ## Your birthday was on a Saturday!


illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers
It is really easy to import functionality from other modules,
Applications Function strptime() is a method of class datetime,
Time series
Moving window Both methods, strptime() and strftime(), are used to convert
Financial applications
Optimization between strings and date time specifications.

© 2020 PyEcon.org
Time since birth 29
Essential
concepts
Getting started And how many days have passed since then (until Merkel’s 4th swearing-
Procedural
programming in as Federal Chancellor)?
Object-orientation

Numerical
programming
Age in days
NumPy package
someday = datetime.strptime("14-03-2018", "%d-%m-%Y")
Array basics
Linear algebra
print("You are " + str((someday - birthday).days) + " days old!")
Data formats and
handling ## You are 23251 days old!
Pandas package
Series
DataFrame
Import/Export data You can create time differences, i.e., the operator - is overloaded,
Visual
illustrations The difference represents a new object, with its own attributes,
Matplotlib package
Figures and subplots
such as days,
Plot types and styles
Pandas layers
When using the overloaded operator +, you have to explicitly
Applications convert the number of days by means of str() into a string.
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Time since birth 30
Essential
concepts
Getting started How many years, weeks and days do you think that is?
Procedural
programming
Object-orientation Human readable age
Numerical
programming
from dateutil.relativedelta import relativedelta
NumPy package delta = relativedelta(someday, birthday)
Array basics print(f"That’s {delta.years} years, {delta.months} months "
Linear algebra
f"and {delta.days} days!!")
Data formats and
handling
Pandas package
## That's 63 years, 7 months and 25 days!!
Series
DataFrame
Import/Export data

Visual
You don’t have to keep reinventing the wheel – a wealth of packages
illustrations and individual modules are freely available,
Matplotlib package
Figures and subplots
A lowercase f before "..." provides convenient formatting – there
Plot types and styles
Pandas layers are other options as well,
Applications
Time series
Two strings in sequence are implicitly joined together – "That"
Moving window "’s nice"!
Financial applications
Optimization

© 2020 PyEcon.org
Getting help 31
Essential
concepts
Getting started When working with the interactive interpreter, i.e., in a notebook, you
Procedural
programming can quickly get useful information about Python objects:
Object-orientation

Numerical
programming
Help system
NumPy package
help(len)
Array basics
Linear algebra
## Help on built-in function len in module builtins:
Data formats and
handling ##
Pandas package ## len(obj, /)
Series
## Return the number of items in a container.
DataFrame
Import/Export data

Visual Alternatively, e.g., for more complex problems, it is best to search


illustrations
Matplotlib package
directly with your preferred internet search engine.
Figures and subplots
Plot types and styles
You can find neat solutions to conventional challenges in literature.
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Lexical structure 32
Essential
concepts
Getting started As with natural language, programming languages have a lexical struc-
Procedural
programming ture. Source code consists of the smallest possible, indivisible elements,
Object-orientation
the tokens. In Python you can find the following groups of elements:
Numerical
programming
NumPy package Literals
Array basics
Linear algebra
Variables
Data formats and
handling Operators
Pandas package
Series Delimiters
DataFrame
Import/Export data Keywords
Visual
illustrations Comments
Matplotlib package
Figures and subplots
Plot types and styles These terms give us a rock-solid foundation for exploring the heart of
Pandas layers
a programming language.
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Literals and variables 33
Essential
concepts
Getting started Basically, we distinguish between literals and variables:
Procedural
programming
Object-orientation Assigning variables with literals
Numerical
programming
myint = 7
NumPy package myfloat = 4.0
Array basics myboat = "nice"
Linear algebra
mybool = True
Data formats and
handling
myfloat = myboat
Pandas package
Series
DataFrame
Import/Export data In this course, we will work with four different literals: integer (7),
Visual float (4.0), string ("nice") and boolean (True),
illustrations
Matplotlib package Literals are assigned to variables at runtime,
Figures and subplots
Plot types and styles In Python the data type is derived from the literal and does not
Pandas layers

Applications
have to be described explicitly,
Time series
Moving window
It is allowed to assign values of different data types to the same
Financial applications variable (name) sequentially,
Optimization

If we don’t assign a literal to any variables, we forfeit it.


© 2020 PyEcon.org
Operators and delimiters 34
Essential
concepts
Getting started Most operators and delimiters will be introduced to you during this
Procedural
programming course. Here is an overview of the operators:
Object-orientation

Numerical
programming
Overview of operators
NumPy package
## + - * / ** //
## % @ << >> & |
Array basics
Linear algebra

Data formats and


## ^ ~ == != < >
handling ## <= >= and or not in
Pandas package ## not in is is not
Series
DataFrame
Import/Export data An overview of the delimiters follows:
Visual
illustrations Overview of delimiters
Matplotlib package
Figures and subplots ## ( ) [ ] { }
Plot types and styles ## , : . = ; ->
Pandas layers
## += -= *= /= **= //=
Applications ## %= @= <<= >>= &= |=
Time series
Moving window
## ^= ' " \ @ SPACE
Financial applications
Optimization

© 2020 PyEcon.org
Arithmetic operators 35
Essential
concepts
Getting started All regular arithmetic operations involving numbers are possible:
Procedural
programming
Object-orientation Pocket calculator
Numerical 10 + 5
programming
NumPy package
100 - 20
Array basics 8 / 2
Linear algebra 4 * (10 + 20)
Data formats and 2**3
handling
Pandas package ## 15
Series ## 80
DataFrame
## 4.0
Import/Export data
## 120
Visual
illustrations ## 8
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers The result of dividing two integers is a floating point number,
Applications
Time series
The conventional rules apply: Parentheses first, then multiplication
Moving window and division, etc.,
Financial applications
Optimization The operator ** is used for exponentiation.

© 2020 PyEcon.org
Boolean operators 36
Essential
concepts
Getting started In order to demonstrate the use of logical operators (and formatted
Procedural
programming strings and for-loops), we create a handy table summarizing some
Object-orientation
important results from boolean algebra:
Numerical
programming
NumPy package Logical table
Array basics
Linear algebra # Create table head
Data formats and print("a b a and b a or b not a\n"
handling "--------------------------------")
Pandas package
Series
DataFrame # Loop through the rows
Import/Export data for a in [False, True]:
Visual for b in [False, True]:
illustrations
print(f"{a:1} {b:3} {a and b:6} {a or b:8} {not a:7}")
Matplotlib package
Figures and subplots ## a b a and b a or b not a
Plot types and styles
Pandas layers
## --------------------------------
## 0 0 0 0 1
Applications
Time series
## 0 1 0 1 1
Moving window ## 1 0 0 1 0
Financial applications ## 1 1 1 1 0
Optimization

© 2020 PyEcon.org
Keywords and comments 37
Essential
concepts
Getting started The programmer explains the structure of his/her program to the
Procedural
programming interpreter via a restricted set of short commands, the keywords:
Object-orientation

Numerical Overview of keywords


programming
NumPy package ## and as assert break class continue
Array basics ## def del elif else except False
## finally for from global if import
Linear algebra

Data formats and


handling
## in is lambda None nonlocal not
Pandas package ## or pass raise return True try
Series ## while with yield
DataFrame
Import/Export data

Visual
There are two ways to make comments:
illustrations
Matplotlib package Provide some comments
Figures and subplots
Plot types and styles # Set variable to something - or nothing?
Pandas layers something = None
Applications
Time series """
Moving window
Financial applications
I am a docstring!
Optimization A multiline string comment hybrid.
I will be useful for describing classes and methods.
"""
© 2020 PyEcon.org
Data types 38
Essential
concepts
Getting started Python offers the following basic data types, which we will use in this
Procedural
programming course:
Object-orientation

Numerical Data type Description


programming
NumPy package
int() Integers
Array basics float() Floating point numbers
Linear algebra

Data formats and


str() Strings, i.e., unicode (UTF-8) texts
handling
bool() Boolean, i.e., True or False
Pandas package
Series list() List, an ordered array of objects
tuple()
DataFrame
Import/Export data
Tuple, an ordered, unmutable array of objects
Visual dict() Dictionary, an unordered, associative array of objects
set()
illustrations
Matplotlib package Set, an unordered array/set of objects
Figures and subplots
Plot types and styles
None() Nothing, emptyness, the void..
Pandas layers

Applications
Each data type has its own methods, that is, functions that are appli-
Time series cable specifically to an object of this type.
Moving window
Financial applications You will gradually get to know new and more complex data types or
Optimization
object classes.
© 2020 PyEcon.org
Lists 39
Essential
concepts
Getting started A list is an ordered array of objects, accessible via an index:
Procedural
programming
Object-orientation Listing tech companies
stocks = ["Google", "Amazon", "Facebook", "Apple"]
Numerical
programming
NumPy package stocks[1]
Array basics
stocks.append("Twitter")
stocks.insert(2, "Microsoft")
Linear algebra

Data formats and


handling
stocks.sort()
Pandas package
## ['Google', 'Amazon', 'Facebook', 'Apple']
## Amazon
Series
DataFrame
Import/Export data ## ['Google', 'Amazon', 'Facebook', 'Apple', 'Twitter']
Visual ## ['Google', 'Amazon', 'Microsoft', 'Facebook', 'Apple', 'Twitter']
illustrations ## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft', 'Twitter']
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers
The constructor for new lists is [ ],
Applications
Time series The first element has the index 0,
Moving window
Financial applications The data type list() possesses its own methods.
Optimization

© 2020 PyEcon.org
Tuples 40
Essential
concepts
Getting started Tuples are immutable sequences related to lists that cannot be extended,
Procedural
programming for example. The drawbacks in flexibility are compensated by the
Object-orientation
advantages in speed and memory usage:
Numerical
programming
NumPy package Selecting elements in sequences
Array basics
Linear algebra lottery = (1, 8, 9, 12, 24, 28)
Data formats and len(lottery)
handling lottery[1:3]
lottery[:4]
Pandas package
Series
DataFrame lottery[-1]
Import/Export data lottery[-2:]
Visual
illustrations ## (1, 8, 9, 12, 24, 28)
Matplotlib package ## 6
Figures and subplots
## (8, 9)
Plot types and styles
Pandas layers
## (1, 8, 9, 12)
Applications
## 28
Time series ## (24, 28)
Moving window
Financial applications
Optimization The same operations are also supported when using lists.

© 2020 PyEcon.org
Dictionaries 41
Essential
concepts
Getting started Dictionaries are associative collections of key-value pairs. The key must
Procedural
programming be immutable and unique:
Object-orientation

Numerical
programming
Internet slang dictionary
NumPy package
slang = {"imho": "in my humble opinion",
"lol": "laughing out loud",
Array basics
Linear algebra

Data formats and


"tl;dr": "too long; didn’t read"}
handling slang["lol"]
Pandas package slang["gl&hl"] = "good luck & have fun"
Series
DataFrame
slang.keys()
Import/Export data slang.values()
Visual
illustrations
## {'imho': 'in...ion', 'lol': 'la...oud', 'tl;dr': 'to...ead'}
Matplotlib package ## laughing out loud
Figures and subplots ## good luck & have fun
Plot types and styles
## dict_keys(['imho', 'lol', 'tl;dr', 'gl&hl'])
## dict_values([... & have fun'])
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization
The constructor for dict() is { } with :,
The pairs are unordered, iterable sequences.
© 2020 PyEcon.org
Sets 42
Essential
concepts
Getting started A set is an unordered collection of objects without duplicates:
Procedural
programming
Object-orientation Set operations
x = {"o", "n", "y", "t"}
Numerical
programming
NumPy package y = {"p", "h", "o", "n"}
Array basics
x & y
x | y
Linear algebra

Data formats and


handling
x - y
Pandas package
Series
## {'t', 'n', 'o', 'y'}
DataFrame ## {'o', 'p', 'n', 'h'}
Import/Export data ## {'n', 'o'}
Visual ## {'p', 'h', 't', 'n', 'o', 'y'}
illustrations
## {'t', 'y'}
Matplotlib package
Figures and subplots
Plot types and styles

The constructor for set() is { },


Pandas layers

Applications
Time series
Defines its own operators that overload existing ones.
Moving window
Financial applications
Optimization
Empty set via set(), because {} already creates dict().

© 2020 PyEcon.org
Comparison operators 43
Essential
concepts
Getting started The <, <=, >, >=, ==, != operators compare the values of two objects
Procedural
programming and return True or False.
Object-orientation

Numerical Op. True, only if the value of the left operand is


programming
NumPy package
< less than the value of the right operand
Array basics <= less than or equal to the value of the right operand
Linear algebra

Data formats and


> greater than the value of the right operand
handling
>= greater than or equal to the value of the right operand
Pandas package
Series == equal to the right operand
DataFrame
Import/Export data
!= not equal to the right operand
Visual
illustrations The comparison depends on the datatype of the objects. For example
Matplotlib package
Figures and subplots
"7" == 7 will return False, while 7.0 == 7 will return True.
Plot types and styles
Pandas layers
Numbers are compared arithmetically.
Applications Strings are compared lexicographically.
Time series
Moving window Tuples and lists are compared lexicographically using comparison
Financial applications
Optimization of corresponding elements. This behaviour can be altered.

© 2020 PyEcon.org
Comparison operators 44
Essential
concepts
Getting started
Procedural Comparing examples
programming
Object-orientation x, y = 5, 8
Numerical print("x < y is", x < y)
programming
NumPy package
## x < y is True
Array basics
Linear algebra
print("x > y is", x > y)
Data formats and
handling
Pandas package ## x > y is False
Series
DataFrame print("x == y is", x == y)
Import/Export data

Visual ## x == y is False
illustrations
Matplotlib package
Figures and subplots
print("x != y is", x != y)
Plot types and styles
Pandas layers ## x != y is True
Applications
Time series print("This is", "Name" == "Name", "and not", "Name" == "name")
Moving window
Financial applications
## This is True and not False
Optimization

Comparing strings, the case has to be considered.


© 2020 PyEcon.org
Chaining comparison operators 45
Essential
concepts
Getting started In Python, comparison operators can also be chained.
Procedural
programming
Object-orientation Chaining comparison examples
Numerical
programming
x = 5
NumPy package
Array basics 5 >= x > 4
Linear algebra

Data formats and ## True


handling
Pandas package
Series
12 < x < 20
DataFrame
Import/Export data ## False
Visual
illustrations 2 < x < 10
Matplotlib package
Figures and subplots
Plot types and styles
## True
Pandas layers

Applications
2 < x and x < 10 # unchained expression
Time series
Moving window ## True
Financial applications
Optimization
The comparison is performed for both sides and combined by and.

© 2020 PyEcon.org
Logical operators 46
Essential
concepts
Getting started There are three logical operators: not, and, or.
Procedural
programming
Object-orientation Op. Description
Numerical not x Returns True only if x is False
programming
NumPy package x and y Returns True only if x and y are True
Array basics
Linear algebra
x or y Returns True only if x or y or both are True
Data formats and
handling
Pandas package Logical operators examples
Series
DataFrame x, y = 5, 8
Import/Export data

Visual (x == 5) and (y == 9)
illustrations

## False
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers (x == 5) or (y == 8)
Applications
Time series ## True
Moving window
Financial applications not(x == 4) or (y == 9)
Optimization

## True
© 2020 PyEcon.org
Exclusive or 47
Essential
concepts
Getting started In some situations, you need a logical operation that is True only when
Procedural
programming the operands differ (one is True, the other is False). This task can
Object-orientation
be solved by using the logical operators not, and, or or simply !=.
Numerical
programming
NumPy package Exclusive or
x, y = 5, 8
Array basics
Linear algebra

Data formats and


handling ((x == 5) and not (y == 8)) or (not (x == 5) and (y == 8))
Pandas package
Series
## False
DataFrame
Import/Export data
x = 4
Visual
illustrations ((x == 5) and not (y == 8)) or (not (x == 5) and (y == 8))
Matplotlib package
Figures and subplots ## True
Plot types and styles
Pandas layers
(x == 5) != (y == 8)
Applications
Time series
## True
Moving window
Financial applications
Optimization In many other programming languages, an operation “exclusive or” or
xor is explicitly part of the language, but not in Python.
© 2020 PyEcon.org
Binary numbers 48
Essential
concepts
Getting started Bitwise operators operate on numbers, but instead of treating that
Procedural
programming number as if it were a single (decimal) value, they operate on the string
Object-orientation
of bits representation, written in binary. A binary number is a number
Numerical
programming expressed in the base-2 numeral system, also called binary numeral
NumPy package
Array basics
system, which consists of only two distinct symbols: typically 0 (zero)
Linear algebra and 1 (one).
Data formats and
handling
Pandas package
Binary numbers
## Decimal: Binary:
Series
DataFrame
Import/Export data ## 0: 0
Visual ## 1: 1
illustrations ## 2: 10
Matplotlib package
## 3: 11
Figures and subplots
Plot types and styles
## 4: 100
Pandas layers ## 5: 101
Applications ## 6: 110
Time series ## 7: 111
Moving window
## 8: 1000
Financial applications
Optimization
## 9: 1001
## 10: 1010

© 2020 PyEcon.org
Binary numbers 49
Essential
concepts
Getting started How to convert binary numbers to integers (the unknown keywords and
Procedural
programming language structures will be introduced soon):
Object-orientation

Numerical
programming
Binary to integer
NumPy package
def bintoint(binary):
Array basics
Linear algebra
binary = binary[::-1]
Data formats and
num = 0
handling for i in range(len(binary)):
Pandas package
num += int(binary[i]) * 2**i
return num
Series
DataFrame
Import/Export data

Visual bintoint("1101001")
illustrations
Matplotlib package ## 105
Figures and subplots

int("1101001", 2)
Plot types and styles
Pandas layers
# compare with built-in function
Applications
Time series
## 105
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Binary numbers 50
Essential
concepts
Getting started How to convert integers to binary numbers:
Procedural
programming
Object-orientation
Integers to binary
Numerical def inttobin(num):
programming
NumPy package
binary = ""
Array basics if num != 0:
Linear algebra
while num >= 1:
Data formats and if num % 2 == 0:
handling
Pandas package
binary += "0"
Series num = num / 2
DataFrame else:
Import/Export data
binary += "1"
Visual
illustrations
num = (num - 1) / 2
Matplotlib package else:
Figures and subplots binary = "0"
Plot types and styles
return binary[::-1]
Pandas layers
inttobin(105)
Applications

## '1101001'
Time series
Moving window
Financial applications
Optimization bin(105)[2:] # compare with built-in function

## '1101001'
© 2020 PyEcon.org
Bitwise operators 51
Essential
concepts
Getting started Python offers distinct bitwise operators. Some of them will be redefined
Procedural
programming entirely different by extensions, such as, e. g., vectorization.
Object-orientation

Numerical Bit. op. Description


programming
NumPy package
x >> y Returns x with the bits shifted to the left by y places
Array basics x << y Returns x with the bits shifted to the right by y places
Linear algebra

Data formats and


x&y Does a bitwise and
handling
Pandas package
x|y Does a bitwise or
Series ~x Returns the complement of x
x^y
DataFrame
Import/Export data Does a bitwise exclusive or
Visual
illustrations Bitwise operators
Matplotlib package
Figures and subplots a, b = 5, 7
Plot types and styles
c = a & b # bitwise and
Pandas layers

Applications
## a: 101
Time series ## b: 111
Moving window ## c: 101
Financial applications
Optimization print(c)

## 5
© 2020 PyEcon.org
Bitwise operators 52
Essential
concepts
Getting started
Procedural Bitwise operators
programming
Object-orientation a, b = 5, 7
Numerical c = a | b # bitwise or
programming
NumPy package
## a: 101
Array basics ## b: 111
Linear algebra
## c: 111
Data formats and
handling print(c)
Pandas package
Series ## 7
DataFrame
Import/Export data
a = 13
Visual
b = a << 2 # bitwise shift
illustrations
Matplotlib package
## a: 1101
Figures and subplots ## b: 110100
Plot types and styles
a, b = 35, 37
Pandas layers
c = a ^ b # bitwise exclusive or
Applications
Time series ## a: 100011
Moving window ## b: 100101
Financial applications
## c: 000110
Optimization

© 2020 PyEcon.org
Control flow: Conditional statements 53
Essential
concepts
Getting started Python has only one kind of conditional statement – if-elif-else:
Procedural
programming
Object-orientation Computer data sizes
bytes = 100000000 / 8 # e.g. DSL 100000
Numerical
programming
NumPy package if bytes >= 1e9:
Array basics
print(f"{bytes/1e9:6.2f} GByte")
elif bytes >= 1e6:
Linear algebra

Data formats and


handling
print(f"{bytes/1e6:6.2f} MByte")
Pandas package elif bytes >= 1e3:
Series print(f"{bytes/1e3:6.2f} KByte")
DataFrame
Import/Export data
else:
print(f"{bytes:6.2f} Byte")
Visual
illustrations
Matplotlib package ## 12.50 MByte
Figures and subplots
Plot types and styles
Pandas layers Control flow structures may be nested in any order:
Applications
Time series Nestings
Moving window
Financial applications if a > 1:
Optimization if b > 2:
pass # a special keyword for empty blocks
© 2020 PyEcon.org
Control flow: The for loop 54
Essential
concepts
Getting started In Python there exist two conventional program loops – for-in-else:
Procedural
programming
Object-orientation Total sum
Numerical
programming
numbers = [7, 3, 4, 5, 6, 15]
NumPy package
y = 0
Array basics for i in numbers:
Linear algebra
y += i
Data formats and print(f"The sum of ’numbers’ is {y}.")
handling
Pandas package
Series
## The sum of 'numbers' is 40.
DataFrame
Import/Export data
Lists or other collections can also be created dynamically:
Visual
illustrations
Matplotlib package Powers of 2
powers = [2 ** i for i in range(11)]
Figures and subplots
Plot types and styles
Pandas layers teacher = ["***", "**", "*"]
Applications grades = {star: len(teacher) - len(star) + 1 for star in teacher}
## [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
Time series
Moving window
Financial applications ## {'***': 1, '**': 2, '*': 3}
Optimization

© 2020 PyEcon.org
Control flow: continue and break 55
Essential
concepts
Getting started Loops can skip iterations (continue):
Procedural
programming
Object-orientation Continue the loop
Numerical
programming
for x in ["a", "b", "c"]:
NumPy package a = x.upper()
Array basics continue
Linear algebra
print(x)
Data formats and print(a)
handling
Pandas package
Series ## C
DataFrame
Import/Export data
Or a loop can be aborted instantly (break):
Visual
illustrations
Matplotlib package Breaking the habit
Figures and subplots
Plot types and styles y = 0
Pandas layers for i in [7, 3, 4, "x", 6, 15]:
Applications if not isinstance(i, int):
Time series break
Moving window
y += i
Financial applications
Optimization
print(f"The total sum is {y}.")

## The total sum is 14.


© 2020 PyEcon.org
Control flow: The while loop 56
Essential
concepts
Getting started For loops where the number of iterations is not known at the beginning,
Procedural
programming you use while-else.
Object-orientation

Numerical
Have you already noticed the keyword else? Python only executes the
programming
NumPy package
branch if it was not terminated by break:
Array basics
Linear algebra Favorite lottery number
Data formats and
handling
import random
Pandas package n = 0
Series favorite = 7
DataFrame
Import/Export data
while n < 100:
n += 1
Visual
illustrations draw = random.randint(1, 49) # e.g. German lottery
Matplotlib package if draw == favorite:
Figures and subplots
print("Got my number! :)")
Plot types and styles
Pandas layers
break
Applications
else:
Time series print("My favorite did not show up! :(")
Moving window print(f"I tried {n} times!")
Financial applications
Optimization ## Got my number! :)
## I tried 10 times!
© 2020 PyEcon.org
Functions 57
Essential
concepts
Getting started Functions are defined using the keyword def. The structure of function
Procedural
programming signature and body is specified by indentation, too:
Object-orientation

Numerical
programming
Drawing lottery numbers
NumPy package
def draw_sample(n, first=1, last=49):
numbers = list(range(first, last + 1))
Array basics
Linear algebra

Data formats and


sample = []
handling for i in range(n):
Pandas package ind = random.randint(0, len(numbers) - 1)
Series
DataFrame
sample.append(numbers.pop(ind))
Import/Export data sample.sort()
Visual
return sample
illustrations
Matplotlib package draw_sample(6)
Figures and subplots draw_sample(6, 80, 100)
Plot types and styles
draw_sample(3, first=5)
Pandas layers

Applications ## [2, 3, 4, 16, 23, 28]


Time series ## [82, 84, 94, 95, 99, 100]
Moving window
## [5, 12, 16]
Financial applications
Optimization

© 2020 PyEcon.org
Functions 58
Essential
concepts
Getting started Functions are of type callable(), defined as closures, and can be
Procedural
programming created and used like other objects:
Object-orientation

Numerical Prime numbers


programming
NumPy package def primes(n):
Array basics numbers = [2]
Linear algebra

Data formats and def is_prime(num):


handling
Pandas package
for i in numbers:
Series if num % i == 0:
DataFrame return False
Import/Export data
return True
Visual
illustrations
if n == 2:
Matplotlib package return numbers
Figures and subplots for i in range(3, n + 1):
Plot types and styles
if is_prime(i):
Pandas layers
numbers.append(i)
Applications
Time series
return numbers
Moving window primes(50)
Financial applications
Optimization ## [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

© 2020 PyEcon.org
Seems weird? We discuss namespaces in the next section.
Section 1.3 59
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Essential concepts
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Object-orientation
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Python is object-oriented 60
Essential
concepts
Getting started There are three widely known programming paradigms: procedural,
Procedural
programming functional and object-oriented programming (OOP). Python supports
Object-orientation
them all.
Numerical
programming
NumPy package You have learned how to handle predefined data types in Python.
Array basics
Linear algebra
Actually, we have already encountered classes and instances, take for
Data formats and
example dict().
handling
Pandas package In this section you will learn the basics of dealing with (your own)
Series
DataFrame
classes:
Import/Export data
1 References
Visual
illustrations 2 Classes
Matplotlib package
Figures and subplots 3 Instances
Plot types and styles
Pandas layers 4 Main principles
Applications
Time series 5 Garbage collection
Moving window
Financial applications OOP is a wide field and challenging for beginners. Don’t get discouraged
Optimization
and, if you find deficits in yourself, read the literature.

© 2020 PyEcon.org
References 61
Essential
concepts
Getting started When you assign a variable, a reference to an object is set:
Procedural
programming
Object-orientation Equal but not identical
Numerical
programming a = ["Star", "Trek"]
NumPy package b = ["Star", "Trek"]
Array basics c = a
Linear algebra
a == b
Data formats and
handling
a == c
Pandas package a is b
Series a is c
DataFrame
Import/Export data ## ['Star', 'Trek']
Visual
## ['Star', 'Trek']
illustrations ## ['Star', 'Trek']
Matplotlib package
## True
Figures and subplots
Plot types and styles
## True
Pandas layers ## False
Applications ## True
Time series
Moving window
Financial applications
Optimization Two equal but not identical objects are created,
Variables a and c link to the same object.
© 2020 PyEcon.org
Copying objects 62
Essential
concepts
Getting started When we introduced lists, we initially did not mention that they are a
Procedural
programming first-class example of mutable objects:
Object-orientation

Numerical Collecting grades


programming
NumPy package grades = [1.7, 1.3, 2.7, 2.0]
Array basics
result = grades.append(1.0)
Linear algebra
result
Data formats and
handling grades
Pandas package finals = grades
Series finals.remove(2.7)
DataFrame
Import/Export data
finals
grades
Visual
illustrations
## None
Matplotlib package
Figures and subplots
## [1.7, 1.3, 2.7, 2.0, 1.0]
Plot types and styles ## [1.7, 1.3, 2.0, 1.0]
Pandas layers ## [1.7, 1.3, 2.0, 1.0]
Applications
Time series
Moving window
Financial applications
Modifications can be in-place – the object itself is modified.
Optimization
Changing an object that is referenced several times could cause
(un)intended consequences.
© 2020 PyEcon.org
Side effects 63
Essential
concepts
Getting started In Python, arguments are passed by assignment, i.e., call-by-reference:
Procedural
programming
Object-orientation Side effects
Numerical def last_element(x):
programming
NumPy package
return x.pop(-1)
Array basics
Linear algebra
a = stocks
Data formats and last_element(a)
handling
Pandas package
a
Series
## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft', 'Twitter']
## Twitter
DataFrame
Import/Export data
## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft']
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles There are side effects,
Pandas layers

Applications
Referenced mutable objects might be modified,
Time series
Moving window
Referenced immutable objects might be copyied.
Financial applications
Optimization

© 2020 PyEcon.org
Copying objects 64
Essential
concepts
Getting started We are able to make an exact copy of the object:
Procedural
programming
Object-orientation Copying
Numerical
programming def last_element(x):
NumPy package y = x.copy()
Array basics
return y.pop(-1)
Linear algebra

Data formats and


handling
a = stocks
Pandas package last_element(a)
Series a
DataFrame
Import/Export data ## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft']
Visual ## Microsoft
illustrations ## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft']
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers
We receive a new object,
Applications
Time series The new object is not identical to the old one.
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Deep and shallow copying 65
Essential
concepts
Getting started However, keep in mind that, in most cases, a method copy() will
Procedural
programming create shallow copys while only deep copying will duplicate also the
Object-orientation
contents of a mutable object with a complex structure:
Numerical
programming
NumPy package Cloning fast food
Array basics
Linear algebra fastfood = [["burgers", "hot dogs"], ["pizza", "pasta"]]
Data formats and italian = fastfood.copy()
handling italian.pop(0)
american = list(fastfood)
Pandas package
Series
DataFrame american.pop(1)
Import/Export data american[0] = american[0].copy()
Visual fastfood[0][1] = "chicken wings"
illustrations
fastfood[1][0] = "risotto"
Matplotlib package
Figures and subplots
italian
Plot types and styles american
Pandas layers
## [['risotto', 'pasta']]
Applications
Time series
## [['burgers', 'hot dogs']]
Moving window
Financial applications
Optimization
Both approaches, copy() and list(), create new list objects con-
taining new references to the original sub-lists. But for a deep copy,
© 2020 PyEcon.org
you have to recursively create duplicates of all its objects.
Classes 66
Essential
concepts
Getting started In Python everything is an object and more complex objects consist of
Procedural
programming several other objects.
Object-orientation

Numerical In the OOP, we create objects according to patterns. These kinds of


programming
NumPy package blueprints are called classes and are characterized by two categories of
Array basics
Linear algebra
elements:
Data formats and
handling
Attributes:
Pandas package Variables that represent the properties of
Series
DataFrame an object, object attributes, or
Import/Export data

Visual a class, named class attributes.


illustrations
Matplotlib package Methods:
Figures and subplots
Plot types and styles
Functions that are defined within a class:
Pandas layers
(non-static) methods can access all attributes, while
Applications
Time series
static methods can only access class attributes.
Moving window
Financial applications
Optimization Every generated object is an instance of such a construction plan.

© 2020 PyEcon.org
Class definition 67
Essential
concepts
Getting started Specifically, we want to create “rectangle object” and define a separate
Procedural
programming Rectangle class for it:
Object-orientation

Numerical
programming
Rectangle class
NumPy package class Rectangle:
width = 0
Array basics
Linear algebra
height = 0
Data formats and
handling
Pandas package def area(self):
Series
return self.width * self.height
DataFrame
Import/Export data

Visual
myrectangle = Rectangle()
illustrations myrectangle.width = 10
Matplotlib package
myrectangle.height = 20
myrectangle.area()
Figures and subplots
Plot types and styles
Pandas layers

Applications
## 200
Time series
Moving window
Financial applications
Optimization
New classes are defined using the keyword class,
The variable self always refers to the instance itself.
© 2020 PyEcon.org
Class constructor 68
Essential
concepts
Getting started We add a constructor (method) __init__(), that is called to initialize
Procedural
programming an object of Rectangle:
Object-orientation

Numerical
programming
Rectangle class with constructor
NumPy package class Rectangle:
width = 0
Array basics
Linear algebra
height = 0
Data formats and
handling
Pandas package def __init__(self, width, height):
Series
self.width = width
DataFrame
Import/Export data self.height = height
Visual
illustrations def area(self):
Matplotlib package
return self.width * self.height
myrectangle = Rectangle(15, 30)
Figures and subplots
Plot types and styles
Pandas layers myrectangle.area()
Applications
Time series ## 450
Moving window
Financial applications
Optimization
In our example, we use the constructor to set the attributes. Methods
with names matching __fun__() have a special, standardized meaning
© 2020 PyEcon.org
in Python.
Class inheritance 69
Essential
concepts
Getting started One of the most important concepts of OOP is inheritance. A class
Procedural
programming inherits all attributes and methods of its parent class and can add new
Object-orientation
or overwrite existing ones:
Numerical
programming
NumPy package Square inherits Rectangle
Array basics
Linear algebra class Square(Rectangle):
Data formats and def __init__(self, length):
handling super().__init__(length, length)
Pandas package
Series
DataFrame def diagonal(self):
Import/Export data return (self.width**2 + self.height**2)**0.5
Visual mysquare = Square(15)
illustrations
Matplotlib package print(f"Area: {mysquare.area()}")
Figures and subplots
print(f"Diagonal length: {mysquare.diagonal():7.4f}")
Plot types and styles
Pandas layers ## Area: 225
Applications ## Diagonal length: 21.2132
Time series
Moving window
Financial applications The methods of the parent class, including the constructor, may be
referenced by super().
Optimization

© 2020 PyEcon.org
Garbage collection 70
Essential
concepts
Getting started You do not have to worry about memory management in Python. The
Procedural
programming garbage collector will tidy up for you.
Object-orientation

Numerical If there are no more references to an object, it is automatically disposed


programming
NumPy package of by the garbage collector:
Array basics
Linear algebra
Garbage collection in action
Data formats and
handling class Dog:
Pandas package
def __del__(self):
print("Woof! The dogcatcher got me! Entering the void.. :(")
Series
DataFrame
Import/Export data # My old dog on a leash
Visual mydog = Dog()
illustrations # A new dog is born
Matplotlib package
Figures and subplots
newdog = Dog()
Plot types and styles # Using my leash for the new dog
Pandas layers mydog = newdog
Applications
Time series ## Woof! The dogcatcher got me! Entering the void.. :(
Moving window
Financial applications
Optimization The destructor __del__() is executed as the last act before an object
gets deleted.
© 2020 PyEcon.org
Error handling in Python 71
Essential
concepts
Getting started Everyone involved in programming will encounter errors of various types.
Procedural
programming These errors can be stressful and annoying but being aware of the basic
Object-orientation
types of errors that can occur will give you the chance to handle them.
Numerical
programming
NumPy package Seeing the line SyntaxError may let you think "oh no, I’ve done every-
Array basics
Linear algebra
thing wrong", but errors are normal and even experienced programmers
Data formats and
face them frequently. Hints on error handling:
handling
Pandas package Dissect the error: Find the line in the error message that is specified.
Series
DataFrame
Many errors have messages that are not important to the actual
Import/Export data error. In Python you often find the important information at the
Visual
illustrations
end of the error message.
Matplotlib package
Figures and subplots
Errors are often oversights: In most cases the error massage will
Plot types and styles give you the line in your code where the error occurred.
Pandas layers

Applications Search the web: If you are not able to fix the errors on your own,
Time series
copy the error message into a search engine and read through
Moving window
Financial applications the results. Probably someone else also had this problem and the
Optimization
community already found a solution.

© 2020 PyEcon.org
Exceptions versus syntax errors 72
Essential
concepts
Getting started A Python program terminates immediately as it encounters an error. In
Procedural
programming Python, errors can be either syntax errors or exceptions. Syntax errors
Object-orientation
occur when the parser detects a wrong sequence in the Python code.
Numerical
programming An arrow indicates the exact position of the syntax error:
NumPy package
Array basics
Syntax Error
Linear algebra

Data formats and ## print("Hello Word"))


handling
Pandas package ## File "<stdin>", line 1
Series ## print("Hello World"))
DataFrame
Import/Export data
## ^
## SyntaxError: invalid syntax
Visual
illustrations
Matplotlib package
Figures and subplots
An exception occurs whenever a syntactically correct Python code
Plot types and styles results in an error:
Pandas layers

Applications Exception
Time series
Moving window
a = 0 / 0
Financial applications
Optimization
## <stdin> in <module>()
## ----> 1 a = 0 / 0
## ZeroDivisionError: division by zero
© 2020 PyEcon.org
Exceptions 73
Essential
concepts
Getting started Exceptions appear in different types and the type is printed as a part
Procedural
programming of the error message. The next example shows three common built-in
Object-orientation
exceptions:
Numerical
programming
NumPy package Frequent exception
Array basics
Linear algebra 0 / 0
Data formats and
handling
## <stdin> in <module>()
Pandas package ## ----> 1 0 / 0
Series ## ZeroDivisionError: division by zero
DataFrame
Import/Export data 3 + a
Visual ## <stdin> in <module>()
illustrations
Matplotlib package
## ----> 1 3 + a
Figures and subplots ## NameError: name 'a' is not defined
Plot types and styles
Pandas layers
3 + "2"
Applications ## <stdin> in <module>()
Time series ## ----> 1 3 + "2"
Moving window
## TypeError: unsupported operand type(s) for +: 'int' and 'str'
Financial applications
Optimization

A list of all exception classes of the standard library can be found here.
© 2020 PyEcon.org
Exception handling 74
Essential
concepts
Getting started When an exception occurs, the Python interpreter throws an error
Procedural
programming message and exits. But in most situations, you do not want your whole
Object-orientation
program to stop.
Numerical
programming
NumPy package The try block can test a block of code for errors.
Array basics
Linear algebra
The except block lets you handle the error.
Data formats and
handling
Pandas package
Try and except
Series
DataFrame
try:
Import/Export data print(abc)
Visual
except:
illustrations print("An exception occurred")
Matplotlib package
Figures and subplots
## An exception occurred
Plot types and styles
Pandas layers

Applications The statement above will raise an error, because the variable abc is
Time series
Moving window
not defined.
Financial applications
Optimization

© 2020 PyEcon.org
Exception handling 75
Essential
concepts
Getting started You can define multiple exception blocks. For example, if you want to
Procedural
programming execute code when you expect a special kind of error to occur:
Object-orientation

Numerical
programming
Multiple exception blocks
NumPy package try:
Array basics
Linear algebra
print(abc)
except NameError:
Data formats and
handling print("Variable abc is not defined")
Pandas package except:
Series
print("Something else went wrong")
DataFrame
Import/Export data
## Variable abc is not defined
Visual
illustrations try:
Matplotlib package
0 / 0
except NameError:
Figures and subplots
Plot types and styles
Pandas layers print("Variable abc is not defined")
Applications except:
Time series print("Something else went wrong")
Moving window
Financial applications
## Something else went wrong
Optimization

© 2020 PyEcon.org
Exception handling 76
Essential
concepts
Getting started Complementary, like for if-else, the else keyword defines a block of
Procedural
programming code to be executed if no errors were thrown:
Object-orientation

Numerical Else exception


programming
NumPy package try:
Array basics print("Hello World")
Linear algebra
except:
Data formats and
handling
print("Something went wrong")
Pandas package else:
Series print("Everything is okay")
DataFrame
Import/Export data
## Hello World
Visual
illustrations
## Everything is okay
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Exception handling 77
Essential
concepts
Getting started The finally block will be executed regardless if the try block raises
Procedural
programming an error or not. Hence, you can make sure the code is run:
Object-orientation

Numerical Finally exception


programming
NumPy package try:
Array basics print(abc)
Linear algebra
except:
Data formats and
handling
print("Something went wrong")
Pandas package finally:
Series print("This will always be displayed")
DataFrame
Import/Export data
## Something went wrong
Visual
illustrations
## This will always be displayed
Matplotlib package
try:
print("Hello World")
Figures and subplots
Plot types and styles
Pandas layers except:
Applications print("Something went wrong")
Time series finally:
Moving window
print("This will always be displayed")
Financial applications
Optimization
## Hello World
## This will always be displayed
© 2020 PyEcon.org
Raise exception 78
Essential
concepts
Getting started Built-in exceptions are raised whenever pre-defined interpreter errors
Procedural
programming occur. In some situations you might want to raise exceptions on your
Object-orientation
own:
Numerical
programming
NumPy package The raise keyword is used to raise an exception.
Array basics
Linear algebra In the following, the interpreter raises an error if the variable x is lower
Data formats and
handling
than 0:
Pandas package
Series Raise exception
DataFrame
Import/Export data x = -3
Visual if x < 0:
illustrations raise Exception("Sorry, ’x’ is lower than 0.")
Matplotlib package
Figures and subplots ## <stdin> in <module>()
Plot types and styles
## ----> 3 raise Exception(Sorry, 'x' is lower than 0.)
Pandas layers
## Exception: Sorry, 'x' is lower than 0.
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
EAFP versus LBYL 79
Essential
concepts
Getting started LBYL: Look before you leap.
Procedural
programming EAFP: It is easier to ask forgiveness than it is to get permission.
Object-orientation

Numerical LBYL and EAFP are two techniques to deal (i.e., avoid) with exceptions.
programming
NumPy package In short, in LBYL you first check whether something will succeed and
Array basics
Linear algebra
only proceed if it does. EAFP means that you do what you expect and
Data formats and
if an exception might occur, you deal with it:
handling
Pandas package
Series
LBYL
DataFrame if x != 0:
Import/Export data
print(10 / x)
Visual
illustrations
Matplotlib package
Figures and subplots EAFP
Plot types and styles
try:
Pandas layers
print(10 / x)
Applications
Time series
except ZeroDivisionError:
Moving window pass
Financial applications
Optimization

© 2020 PyEcon.org
EAFP versus LBYL 80
Essential
concepts
Getting started So, why use EAFP although it needs more lines of code?
Procedural
programming
Object-orientation
Often, the code is more readable and straight.
Numerical Explicit is better than implicit (Zen of Python, see below).
programming
NumPy package Best performance in case no exception is raised.
Array basics
Linear algebra Detailed exception handling. You can not only consider errors, but
Data formats and
handling
also different kinds of errors and then proceed differently.
Pandas package
Series
DataFrame
EAFP
Import/Export data try:
Visual print(10 / x)
except ZeroDivisionError:
illustrations
Matplotlib package
Figures and subplots print("Zero division")
Plot types and styles except NameError:
Pandas layers
print("Variable ’x’ is not defined")
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Built-in versus user-defined exceptions 81
Essential
concepts
Getting started Python has multiple built-in exceptions which terminate your program
Procedural
programming when something goes wrong. But you can also create custom exceptions
Object-orientation
that serve specific purposes.
Numerical
programming
NumPy package Your own exception can implemented by defining a new class which
Array basics
Linear algebra
derives from the Exception class or a subclass:
Data formats and
handling User-defined exception
Pandas package
Series class ValueTooLargeError(Exception):
DataFrame """Raised when the input value is too large"""
Import/Export data
pass
Visual x = 3
illustrations
Matplotlib package
try:
Figures and subplots if x > 2:
Plot types and styles raise ValueTooLargeError
except ValueTooLargeError:
Pandas layers

Applications
print("The number is too large.")
Time series
Moving window
Financial applications ## The number is too large.
Optimization

© 2020 PyEcon.org
Namespaces 82
Essential
concepts
Getting started We have already come into contact with namenspaces in Python many
Procedural
programming times. These are hierarchically linked layers in which the references to
Object-orientation
objects are defined. A rough distinction is made between
Numerical
programming
NumPy package the global namespace, and
Array basics
Linear algebra
the local namespace.
Data formats and
handling
Pandas package The global namespace is the outermost environment whose references
Series
DataFrame
are known by all objects.
Import/Export data
On the other hand, locally defined references are only known in a local,
Visual
illustrations i.e., internal environment.
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Namespaces 83
Essential
concepts
Getting started Reference names from the local namespace mask the same names in
Procedural
programming an outer or in the global namespace:
Object-orientation

Numerical
programming
Namespaces
NumPy package
def multiplier(x):
x = 4 * x
Array basics
Linear algebra

Data formats and


return x
handling x = "OH"
Pandas package multiplier("AH")
Series
DataFrame
multiplier(x)
Import/Export data x
Visual ## OH
illustrations
## AHAHAHAH
Matplotlib package
Figures and subplots
## OHOHOHOH
Plot types and styles ## OH
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Namespaces 84
Essential
concepts
Getting started In fact, functions defined in Python are themselves objects that remem-
Procedural
programming ber and can access their own context where they were created. This
Object-orientation
concept comes from functional programming and is called closure:
Numerical
programming
NumPy package Closures
Array basics
Linear algebra
def gen_multiplier(a):
Data formats and
def fun(x):
handling return a * x
Pandas package
return fun
Series
DataFrame
Import/Export data multi1 = gen_multiplier(4)
Visual multi2 = gen_multiplier(5)
illustrations multi1
Matplotlib package
Figures and subplots
multi1("EH")
Plot types and styles multi2("EH")
Pandas layers
## <function gen_multiplier.<locals>.fun at 0x7f654804cd30>
Applications ## EHEHEHEH
## EHEHEHEHEH
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Managing code 85
Essential
concepts
Getting started In order to provide, maintain and extend modular functionality with
Procedural
programming Python, its code containing components can be described hierarchically:
Object-orientation

Numerical
programming Packages
NumPy package
Array basics
Linear algebra Modules
Data formats and
handling
Pandas package
Classes
Series
DataFrame Functions
Import/Export data

Visual The organization in Python is very straightforward and is based on the


illustrations
Matplotlib package local namespaces mentioned before.
Figures and subplots
Plot types and styles When you download and use new packages, such as NumPy for numer-
Pandas layers
ical programming in the next chapter, the packages are loaded and the
Applications
Time series
namespaces initialized.
Moving window
Financial applications
The development of custom packages is an advanced topic and not
Optimization essential for a reasonable code structure of small projects, as it is in
other programming languages.
© 2020 PyEcon.org
Importing modules 86
Essential
concepts
Getting started Modules provide classes and functions via namespaces. It is Python
Procedural
programming code that is executed in a local namespace and whose classes and
Object-orientation
functions you can import. Basically, there are the following alternatives
Numerical
programming how to import from an module:
NumPy package
Array basics
Linear algebra
Import statements
Data formats and import datetime
handling
Pandas package
import datetime as dt
Series from datetime import date, timedelta
DataFrame from datetime import *
Import/Export data

Visual dt.date.today()
illustrations
Matplotlib package
dt.timedelta.days
Figures and subplots
Plot types and styles date.today()
timedelta.days
Pandas layers

Applications
Time series
Moving window
datetime.now()
Financial applications
Optimization
In the latter case, all classes and functions, but no instances, are
imported from the datetime namespace.
© 2020 PyEcon.org
Build-in modules 87
Essential
concepts
Getting started A Python installation ships with a standard library consisting of built-
Procedural
programming in modules. These modules provide standardized solutions for many
Object-orientation
problems that occur in everyday programming - “batteries included”.
Numerical
programming For example, they provide access to system functionality such as file
NumPy package
Array basics
management. The Python Docs give an overview of all build-in modules.
Linear algebra

Data formats and Usage of build-in modules


handling
Pandas package import math
Series from random import randint
DataFrame
Import/Export data
math.pi
Visual
illustrations
Matplotlib package ## 3.141592653589793
Figures and subplots
Plot types and styles math.factorial(5)
Pandas layers

Applications ## 120
Time series
Moving window
Financial applications
randint(10, 20)
Optimization
## 18

© 2020 PyEcon.org
Installing modules 88
Essential
concepts
Getting started Often you might want to use extended functionality. Python has a large
Procedural
programming and active community of users who make their developments publicly
Object-orientation
available under open source license terms. Packages are containers of
Numerical
programming modules which can be imported and used within your Python code.
NumPy package
Array basics These third-party packages can be installed comfortably by using the
Linear algebra
(command line) package manager pip. The Python Package Index
Data formats and
handling provides an overview of the thousands of packages available. Basic
Pandas package
Series
commands for maintaining, for example, the installation of the package
DataFrame “numpy”:
Import/Export data

Visual Installing the package: pip install numpy


illustrations
Matplotlib package Upgrading the package: pip install –upgrade numpy
Figures and subplots
Plot types and styles Installing the package locally for the current user:
pip install –user numpy
Pandas layers

Applications
Time series
Uninstalling the package: pip uninstall numpy
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Installing modules 89
Essential
concepts
Getting started Example: OpenCV is a package for image processing in Python. Here
Procedural
programming you can see how the installation proceeds in a Unix terminal.
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Writing modules 90
Essential
concepts
Getting started Your Python projects will become complex and you will need to main-
Procedural
programming tain the codes properly. Therefore, one can break a large, unwieldy
Object-orientation
programming task into separate, more manageable modules. Modules
Numerical
programming can be written in Python itself or in C, but here we keep focussing on
NumPy package
Array basics
the Python language.
Linear algebra
Creating modules in Python is very straightforward - a Python module
Data formats and
handling is a file containing Python code, for example:
Pandas package
Series
DataFrame s = "Hello world!"
Import/Export data
l = [1, 2, 3, 5, 5]
Visual
illustrations
Matplotlib package
Figures and subplots
def add_one(n):
Plot types and styles return n + 1
Pandas layers

Applications
Time series File: mymodule.py
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Working with modules 91
Essential
concepts
Getting started If you import the module mymodule, the interpreter looks in the
Procedural
programming current working directory for a file mymodule.py, reads and interprets
Object-orientation
its contents and makes its namespace available:
Numerical
programming
NumPy package Usage of own modules
Array basics
Linear algebra import mymodule
Data formats and mymodule.s
handling mymodule.l
Pandas package
mymodule.add_one(5)
Series
DataFrame
## Hello world!
## [1, 2, 3, 5, 5]
Import/Export data

Visual
illustrations
## 6
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Python packages 92
Essential
concepts
Getting started Large projects could require more than one module. Packages allow
Procedural
programming to structure the modules and their namespaces hierarchically by using
Object-orientation
the dot notation. They are simple folders containing modules and
Numerical
programming (sub-)packages. Consider the following structure:
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations The directory mypackage contains two modules which we can import
Matplotlib package
Figures and subplots separately:
Plot types and styles
Pandas layers
Usage of own package
Applications
Time series import mypackage.mymodule
Moving window import mypackage.somemodule
Financial applications
Optimization
mypackage.mymodule.add_one(4)
## 5
© 2020 PyEcon.org
Package initialization 93
Essential
concepts
Getting started If a package directory contains a file __init__.py, its code is invoked
Procedural
programming when the package gets imported. The directory mypackage, now,
Object-orientation
contains the two modules and the initialization file:
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots The file __init__.py can be empty but can also be used for package
Plot types and styles
Pandas layers
initialization purposes.
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
The Zen of Python 94
Essential
concepts
Getting started
Procedural The Zen of Python
programming
Object-orientation import this
Numerical
programming
## The Zen of Python, by Tim Peters
NumPy package ##
Array basics ##
Linear algebra
## Beautiful is better than ugly.
Data formats and
handling
## Explicit is better than implicit.
Pandas package
## Simple is better than complex.
Series ## Complex is better than complicated.
DataFrame
## Flat is better than nested.
## Sparse is better than dense.
Import/Export data

Visual
illustrations
## Readability counts.
Matplotlib package ## Special cases aren't special enough to break the rules.
Figures and subplots ## Although practicality beats purity.
Plot types and styles
Pandas layers
## Errors should never pass silently.
## Unless explicitly silenced.
Applications
Time series
## In the face of ambiguity, refuse the temptation to guess.
Moving window ## ...
Financial applications
Optimization

© 2020 PyEcon.org
Further topics 95
Essential
concepts
Getting started A selection of exciting topics that are among the advanced basics but
Procedural
programming are not covered in this lecture:
Object-orientation

Numerical
programming
Dynamic language concepts, such as duck typing,
NumPy package
Array basics
Further, complex type classes, such as ChainMap or OrderedDict,
Linear algebra
Iterators and generators in detail,
Data formats and
handling
Pandas package
Exception handling, raising exceptions, catching errors,
Series
DataFrame
Debugging, introspection and annotations.
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Chapter 2 96
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Numerical programming
Numerical
programming
NumPy package
Array basics
2.1 NumPy package
Linear algebra

Data formats and


2.2 Array basics
handling
Pandas package 2.3 Linear algebra
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 2.1 97
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Numerical programming
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I NumPy package
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
The NumPy package 98
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package

The Numerical Python package NumPy provides efficient tools for sci-
Series
DataFrame
Import/Export data
entific computing and data analysis:
Visual
illustrations np.array(): Multidimensional array capable of doing fast and
Matplotlib package
Figures and subplots efficient computations,
Plot types and styles
Pandas layers Built-in mathematical functions on arrays without writing loops,
Applications
Time series
Built-in linear algebra functions.
Moving window
Financial applications
Optimization
Import NumPy
import numpy as np
© 2020 PyEcon.org
Motivation 99
Essential
concepts
Getting started
Procedural Element-wise addition
programming
Object-orientation vec1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
Numerical vec2 = np.array(vec1)
programming vec1 + vec1
NumPy package
Array basics
Linear algebra
## [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Data formats and
handling
vec2 + vec2
Pandas package
Series ## array([ 2, 4, 6, 8, 10, 12, 14, 16, 18])
DataFrame
Import/Export data
for i in range(len(vec1)):
Visual vec1[i] += vec1[i]
illustrations
Matplotlib package
vec1
Figures and subplots
Plot types and styles ## [2, 4, 6, 8, 10, 12, 14, 16, 18]
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Motivation 100
Essential
concepts
Getting started
Procedural Matrix multiplication
programming
Object-orientation mat1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Numerical mat2 = np.array(mat1)
programming
np.dot(mat2, mat2)
NumPy package
Array basics
Linear algebra ## array([[ 30, 36, 42],
Data formats and
## [ 66, 81, 96],
handling ## [102, 126, 150]])
Pandas package
Series
mat3 = np.zeros([3, 3])
DataFrame
Import/Export data for i in range(3):
Visual
for k in range(3):
illustrations for j in range(3):
Matplotlib package
mat3[i][k] = mat3[i][k] + mat1[i][j] * mat1[j][k]
mat3
Figures and subplots
Plot types and styles
Pandas layers

Applications
## array([[ 30., 36., 42.],
Time series ## [ 66., 81., 96.],
Moving window ## [102., 126., 150.]])
Financial applications
Optimization

© 2020 PyEcon.org
Motivation 101
Essential
concepts
Getting started
Procedural Time comparison
programming
Object-orientation import time
Numerical mat1 = np.random.rand(50, 50)
programming mat2 = np.array(mat1)
t = time.time()
NumPy package
Array basics
Linear algebra mat3 = np.dot(mat2, mat2)
Data formats and nptime = time.time() - t
handling mat3 = np.zeros([50, 50])
Pandas package
Series
t = time.time()
DataFrame for i in range(50):
Import/Export data for k in range(50):
Visual for j in range(50):
illustrations
mat3[i][k] = mat3[i][k] + mat1[i][j] * mat1[j][k]
pytime = time.time() - t
Matplotlib package
Figures and subplots
Plot types and styles times = str(pytime / nptime)
Pandas layers print("NumPy is " + times + " times faster!")
Applications
Time series ## NumPy is 41.55256533479793 times faster!
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 2.2 102
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Numerical programming
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Array basics
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Creating NumPy arrays 103
Essential
concepts
Getting started np.array(list): Converts python list into NumPy arrays.
Procedural
programming array.ndim: Returns Dimension of the array.
Object-orientation
array.shape: Returns shape of the array as a list.
Numerical
programming
NumPy package Creation
Array basics
Linear algebra arr1 = [4, 8, 2]
Data formats and
arr1 = np.array(arr1)
handling arr2 = np.array([24.3, 0., 8.9, 4.4, 1.65, 45])
Pandas package
arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])
arr1.ndim
Series
DataFrame
Import/Export data

Visual
## 1
illustrations
Matplotlib package arr3.shape
Figures and subplots
Plot types and styles
Pandas layers
## (3, 3)
Applications
Time series
Moving window From now on, the name array refers to an np.array().
Financial applications
Optimization

© 2020 PyEcon.org
Array creation functions 104
Essential
concepts
Getting started np.arange(start, stop, step): Creates vector of values from start
Procedural
programming to stop with step width step.
Object-orientation
np.zeros((rows, columns)): Creates array with all values set to 0.
Numerical
programming np.identity(n): Creates identity matrix of dimension n.
NumPy package
Array basics
Linear algebra
Creation functions
Data formats and np.zeros((4, 3))
handling
Pandas package
Series
## array([[0., 0., 0.],
DataFrame ## [0., 0., 0.],
Import/Export data ## [0., 0., 0.],
Visual ## [0., 0., 0.]])
illustrations
Matplotlib package
np.arange(6)
Figures and subplots
Plot types and styles
Pandas layers ## array([0, 1, 2, 3, 4, 5])
Applications
Time series
np.identity(3)
Moving window
Financial applications ## array([[1., 0., 0.],
Optimization
## [0., 1., 0.],
## [0., 0., 1.]])
© 2020 PyEcon.org
Array creation functions 105
Essential
concepts
Getting started np.linspace(start, stop, n): Creates vector of n evenly divided
Procedural
programming values from start to stop.
Object-orientation
np.full((row, column), k): Creates array with all values set to k.
Numerical
programming
NumPy package Array creation
Array basics
Linear algebra np.linspace(0, 80, 5)
Data formats and
handling ## array([ 0., 20., 40., 60., 80.])
Pandas package
Series
DataFrame
np.full((5, 4), 7)
Import/Export data
## array([[7, 7, 7, 7],
Visual
illustrations ## [7, 7, 7, 7],
Matplotlib package ## [7, 7, 7, 7],
Figures and subplots
## [7, 7, 7, 7],
Plot types and styles
Pandas layers ## [7, 7, 7, 7]])
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Array creation functions 106
Essential
concepts
Getting started np.random.rand(rows, columns): Creates array of random floats
Procedural
programming between zero and one.
Object-orientation
np.rondom.randint(k, size=(rows, columns)): Creates array of
Numerical
programming random integers between 0 and k-1.
NumPy package
Array basics
Linear algebra
Array of random numbers
Data formats and np.random.rand(3, 3)
handling
Pandas package
Series
## array([[0.01014591, 0.55955228, 0.48103055],
DataFrame ## [0.30368877, 0.99078572, 0.61537046],
Import/Export data
## [0.83572553, 0.45976471, 0.63241975]])
Visual
illustrations
np.random.randint(10, size=(5, 4))
Matplotlib package
Figures and subplots
Plot types and styles ## array([[7, 9, 7, 8],
Pandas layers ## [0, 6, 7, 5],
Applications ## [7, 3, 4, 7],
Time series ## [9, 4, 4, 8],
Moving window
## [8, 0, 6, 1]])
Financial applications
Optimization

© 2020 PyEcon.org
Copy arrays 107
Essential
concepts
Getting started
Procedural Reference
programming
Object-orientation arr3
Numerical
programming ## array([[4, 8, 5],
NumPy package ## [9, 3, 4],
Array basics
## [1, 0, 6]])
Linear algebra

Data formats and


handling
arr = arr3
Pandas package arr[1, 1] = 777
Series arr3
DataFrame
Import/Export data
## array([[ 4, 8, 5],
Visual
illustrations
## [ 9, 777, 4],
Matplotlib package ## [ 1, 0, 6]])
Figures and subplots
Plot types and styles arr3[1, 1] = 3
Pandas layers

Applications
Time series
Moving window
call-by-reference
Financial applications
Optimization
arr = arr3 binds arr to the existing arr3. They both refer to the
same object.
© 2020 PyEcon.org
Copy array 108
Essential
concepts
Getting started array.copy(): Copies an array without reference (call-by-value).
Procedural
programming
Object-orientation

Numerical Copy Reference


programming
NumPy package arr3 arr3
Array basics
Linear algebra
## array([[4, 8, 5], ## array([[4, 8, 5],
Data formats and ## [9, 3, 4], ## [9, 3, 4],
handling
## [1, 0, 6]]) ## [1, 0, 6]])
Pandas package
Series
DataFrame arr = arr3.copy() arr = arr3
Import/Export data
arr[1, 1] = 777 arr[1, 1] = 777
Visual arr3 arr3
illustrations
Matplotlib package
## array([[4, 8, 5], ## array([[ 4, 8, 5],
Figures and subplots
Plot types and styles ## [9, 3, 4], ## [ 9, 777, 4],
Pandas layers
## [1, 0, 6]]) ## [ 1, 0, 6]])
Applications
Time series arr3[1, 1] = 3
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Overview: Array creation functions 109
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Function Description
Numerical
programming array Convert input array in NumPy array
NumPy package
Array basics arange(start,stop,step) Creates array from given input
Linear algebra
ones Creates array containing only ones
Data formats and
handling zeros Creates array containing only zeros
Pandas package
Series
empty Allocating memory without specific values
DataFrame eye, identity Creates N x N identity matrix
Import/Export data

Visual
linspace Creats array of evenly divided values
illustrations full Creates array with values set to one number
Matplotlib package
Figures and subplots random.rand Creates array of random floats
Plot types and styles
Pandas layers
random.randint Creates array of random int
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Data types of arrays 110
Essential
concepts
Getting started array.dtype: Returns the type of array.
Procedural
programming array.astype(np.type): Conducts a manual typecast.
Object-orientation

Numerical
programming
Data types
NumPy package arr1.dtype
Array basics
Linear algebra
## dtype('int64')
Data formats and
handling
Pandas package arr2.dtype
Series
DataFrame ## dtype('float64')
Import/Export data

Visual arr1 = arr1 * 2.5


arr1.dtype
illustrations
Matplotlib package
Figures and subplots
Plot types and styles ## dtype('float64')
Pandas layers

Applications arr1 = (arr1 / 2.5).astype(np.int64)


Time series arr1.dtype
Moving window
Financial applications
Optimization
## dtype('int64')

© 2020 PyEcon.org
Array operations 111
Essential
concepts
Getting started
Procedural
Element-wise operations
programming
Object-orientation
Calculation operators on NumPy arrays operate element-wise.
Numerical
programming
NumPy package
Array basics
Element-wise operations
Linear algebra
arr3
Data formats and
handling
## array([[4, 8, 5],
Pandas package
Series
## [9, 3, 4],
DataFrame ## [1, 0, 6]])
Import/Export data

Visual arr3 + arr3


illustrations
Matplotlib package
## array([[ 8, 16, 10],
Figures and subplots
Plot types and styles ## [18, 6, 8],
Pandas layers ## [ 2, 0, 12]])
Applications
Time series arr3**2
Moving window
Financial applications
## array([[16, 64, 25],
## [81, 9, 16],
Optimization

## [ 1, 0, 36]])
© 2020 PyEcon.org
Array operations 112
Essential
concepts
Getting started
Procedural Matrix multiplication
programming
Object-orientation
Operator * applied on arrays does not do the matrix multiplication.
Numerical
programming
NumPy package
Array basics
Element-wise operations
Linear algebra
arr3 * arr3
Data formats and
handling
Pandas package
## array([[16, 64, 25],
Series ## [81, 9, 16],
DataFrame ## [ 1, 0, 36]])
Import/Export data

Visual arr = np.ones((3, 2))


illustrations
Matplotlib package
arr
Figures and subplots
Plot types and styles ## array([[1., 1.],
Pandas layers ## [1., 1.],
Applications ## [1., 1.]])
Time series
Moving window arr3 * arr # not defined for element-wise multiplication
## ValueError: operands could not be broadcast together
Financial applications
Optimization

© 2020 PyEcon.org
Integer indexing 113
Essential
concepts
Getting started array[index]: Selects the value at position index from the data.
Procedural
programming
Object-orientation Indexing with an integer
Numerical
programming arr = np.arange(10)
NumPy package arr
Array basics
Linear algebra
## array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Data formats and
handling
Pandas package
arr[4]
Series
DataFrame ## 4
Import/Export data

Visual arr[-1]
illustrations
Matplotlib package
## 9
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Slicing 114
Essential
concepts
Getting started array[start : stop : step]: Selects a subset of the data.
Procedural
programming
Object-orientation Slicing in one dimension
Numerical
programming arr = np.arange(10)
NumPy package arr
Array basics
Linear algebra
## array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Data formats and
handling
Pandas package
arr[3:7]
Series
DataFrame ## array([3, 4, 5, 6])
Import/Export data

Visual arr[1:]
illustrations
Matplotlib package
## array([1, 2, 3, 4, 5, 6, 7, 8, 9])
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Slicing 115
Essential
concepts
Getting started
Procedural Slicing in one dimension with steps
programming
Object-orientation arr[:7]
Numerical
programming ## array([0, 1, 2, 3, 4, 5, 6])
NumPy package

arr[-3:]
Array basics
Linear algebra

Data formats and


handling ## array([7, 8, 9])
Pandas package
Series arr[::-1]
DataFrame
Import/Export data
## array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
Visual

arr[::2]
illustrations
Matplotlib package
Figures and subplots
Plot types and styles ## array([0, 2, 4, 6, 8])
Pandas layers

Applications arr[:5:-1]
Time series
Moving window ## array([9, 8, 7, 6])
Financial applications
Optimization

© 2020 PyEcon.org
Slicing 116
Essential
concepts
Getting started
Procedural Slicing in higher dimensions
programming
Object-orientation
In n-dimensional arrays the element at each index is an
Numerical
programming (n − 1)-dimensional array.
NumPy package
Array basics
Linear algebra Indexing rows
Data formats and
handling arr3
Pandas package
Series ## array([[4, 8, 5],
## [9, 3, 4],
DataFrame
Import/Export data
## [1, 0, 6]])
Visual
illustrations
Matplotlib package vec = arr3[1]
Figures and subplots vec
Plot types and styles

## array([9, 3, 4])
Pandas layers

Applications
Time series
Moving window
arr3[-1]
Financial applications
Optimization ## array([1, 0, 6])

© 2020 PyEcon.org
Slicing 117
Essential
concepts
Getting started
Procedural Slicing in two dimensions
programming
Object-orientation arr3
Numerical
programming ## array([[4, 8, 5],
NumPy package
## [9, 3, 4],
Array basics
Linear algebra
## [1, 0, 6]])
Data formats and
handling arr3[0:2, 0:2]
Pandas package
Series ## array([[4, 8],
DataFrame
## [9, 3]])
Import/Export data

Visual
illustrations
arr3[2:, :]
Matplotlib package
Figures and subplots ## array([[1, 0, 6]])
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Slicing 118
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization
Figure: Python for Data Analysis (2017) on page 99

© 2020 PyEcon.org
Views on arrays 119
Essential
concepts
Getting started So far, selecting by index numbers or slicing belongs to basic indexing
Procedural
programming in NumPy. With basic indexing you get NO COPY of your data but a
Object-orientation
so-called view on the existing data set – a different perspective.
Numerical
programming A view on an array can be seen as a reference to a rectangular memory
NumPy package
Array basics
area of its values. The view is intended to
Linear algebra
edit a rectangular part of a matrix, e.g., a sub-matrix, a column,
Data formats and
handling or a single value,
Pandas package
Series change the shape of the matrix or the arrangement of its elements,
DataFrame
Import/Export data
e.g., transpose or reshape a matrix,
Visual
illustrations
change the visual representation of values, e.g., to cast a float
Matplotlib package array into an int array,
Figures and subplots
Plot types and styles map the values in other program areas.
Pandas layers

Applications The crucial point here is that for efficiency reasons data arrays in your
Time series
working memory do not have to be copied again and again for simple
Moving window
Financial applications index operations, which would require an excessive additional effort
Optimization
writing to the computer memory.

© 2020 PyEcon.org
Creating views implicitly 120
Essential
concepts
Getting started A view is created automatically when you do basic indexing such as
Procedural
programming slicing:
Object-orientation

Numerical
programming
Create a view by slicing
NumPy package
column = arr3[:, 1]
column
Array basics
Linear algebra

Data formats and


handling ## array([8, 3, 0])
Pandas package
Series column.base
DataFrame
Import/Export data
## array([[4, 8, 5],
Visual
illustrations
## [9, 3, 4],
Matplotlib package ## [1, 0, 6]])
Figures and subplots
Plot types and styles column[1] = 100
Pandas layers
arr3
Applications
Time series
## array([[ 4, 8, 5],
## [ 9, 100, 4],
Moving window
Financial applications
Optimization ## [ 1, 0, 6]])

© 2020 PyEcon.org
Creating views implicitly 121
Essential
concepts
Getting started
Procedural Create a view by slicing
programming
Object-orientation elem = column[1:2]
Numerical elem.base
programming
NumPy package
Array basics
## array([[ 4, 8, 5],
Linear algebra ## [ 9, 100, 4],
Data formats and
## [ 1, 0, 6]])
handling
Pandas package elem[0] = 3
Series
arr3
DataFrame
Import/Export data
## array([[4, 8, 5],
Visual
illustrations ## [9, 3, 4],
Matplotlib package ## [1, 0, 6]])
Figures and subplots
Plot types and styles
Pandas layers

Applications The middle column is a view of the base array referenced by arr3,
Time series
Moving window Any changes to the values of a view directly affect the base data,
Financial applications
Optimization A view of a view is another view on the same base matrix.

© 2020 PyEcon.org
Obtaining views explicitly 122
Essential
concepts
Getting started In addition, an array contains methods and attributes that return a
Procedural
programming view of its data:
Object-orientation

Numerical Obtain a view


programming
NumPy package
arr3_t = arr3.T
Array basics arr3_t
Linear algebra

Data formats and ## array([[4, 9, 1],


handling
## [8, 3, 0],
Pandas package
Series
## [5, 4, 6]])
DataFrame
Import/Export data arr3_t.flags.owndata
Visual
illustrations ## False
Matplotlib package
Figures and subplots
Plot types and styles
arr3_r = arr3.reshape(1, 9)
Pandas layers arr3_r
Applications
Time series
## array([[4, 8, 5, 9, 3, 4, 1, 0, 6]])
Moving window
Financial applications arr3_t.flags.owndata
Optimization

## False
© 2020 PyEcon.org
Obtaining views explicitly 123
Essential
concepts
Getting started
Procedural Obtain a view
programming
Object-orientation arr3_v = arr3.view()
Numerical arr3_v.flags.owndata
programming
NumPy package ## False
Array basics
Linear algebra

Data formats and


handling The transposed matrix is a predefined view that is available as an
Pandas package
Series
attribute,
DataFrame
Import/Export data
Reshaping is also just another way of looking at the same set of
Visual data,
illustrations
Matplotlib package By means of the method view() you create a view with an identical
Figures and subplots
Plot types and styles
representation.
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Fancy indexing 124
Essential
concepts
Getting started The behavior described above changes with advanced indexing, i. e., if
Procedural
programming at least one component of the index tuple is not a scalar index number
Object-orientation
or slice. The case of fancy indexing is described below:
Numerical
programming
NumPy package Advanced and basic indexing
Array basics
Linear algebra arr3
Data formats and
handling ## array([[4, 8, 5],
Pandas package ## [9, 3, 4],
Series
DataFrame
## [1, 0, 6]])
Import/Export data

Visual
arr = arr3[[0, 2], [0, 2]]
illustrations arr
Matplotlib package
Figures and subplots
## array([4, 6])
Plot types and styles
Pandas layers
arr.base
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Fancy indexing 125
Essential
concepts
Getting started
Procedural
Advanced and basic indexing
programming
Object-orientation arr = arr3[0:3:2, 0:3:2]
Numerical arr
programming
NumPy package
## array([[4, 5],
Array basics
Linear algebra
## [1, 6]])
Data formats and
handling arr.base
Pandas package
Series ## array([[4, 8, 5],
DataFrame
Import/Export data
## [9, 3, 4],
## [1, 0, 6]])
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Contrary to intuition, fancy indexing does not return a (2 × 2)-
Pandas layers
matrix, but a vector of the matrix elements (0, 0) and (2, 2). This
Applications
Time series
is a complete copy – a new object and not a view to the original
Moving window matrix.
Financial applications
Optimization A submatrix (view) with the corner elements of the initial matrix
can be obtained with slicing.
© 2020 PyEcon.org
Boolean arrays 126
Essential
concepts
Getting started A boolean array is a NumPy array with boolean True and False values.
Procedural
programming Such an array can be created by applying a comparison operator on
Object-orientation
NumPy arrays.
Numerical
programming
NumPy package Boolean arrays
Array basics
Linear algebra bool_arr = (arr3 < 5)
Data formats and bool_arr
handling
Pandas package
## array([[ True, False, False],
## [False, True, True],
Series
DataFrame
Import/Export data ## [ True, True, False]])
Visual
illustrations bool_arr1 = (arr3 == 0)
Matplotlib package
bool_arr1
Figures and subplots
Plot types and styles
Pandas layers
## array([[False, False, False],
Applications
## [False, False, False],
Time series ## [False, True, False]])
Moving window
Financial applications
Optimization The comparison operators on arrays can be combined by means of
NumPy redefined bitwise operators.
© 2020 PyEcon.org
Boolean arrays 127
Essential
concepts
Getting started
Procedural Boolean arrays and bitwise operators
programming
Object-orientation a = np.array([3, 8, 4, 1, 9, 5, 2])
Numerical b = np.array([2, 3, 5, 6, 11, 15, 17])
programming
c = (a % 2 == 0) | (b % 3 == 0) # or
NumPy package
Array basics
c
Linear algebra

Data formats and


## array([False, True, True, True, False, True, True])
handling
Pandas package d = (a > b) ^ (a % 2 == 1) # exclusive or
Series
d
DataFrame
Import/Export data
## array([False, True, False, True, True, True, False])
Visual
illustrations
Matplotlib package c ^ d # exclusive or
Figures and subplots
Plot types and styles
## array([False, False, True, False, True, False, True])
Pandas layers

Applications
Time series
Moving window
Boolean arrays
Financial applications
Optimization Logical operations on NumPy arrays work in a similar way compared
to bitwise operators.
© 2020 PyEcon.org
Indexing with boolean arrays 128
Essential
concepts
Getting started Boolean arrays can be used to select elements of other NumPy arrays.
Procedural
programming If x is an array and y is a boolean array of the same dimension, then
Object-orientation
a[b] selects all the elements of x, for which the correspanding value (at
Numerical
programming the same position) of y is True.
NumPy package
Array basics
Linear algebra
Indexing with boolean arrays
Data formats and arr3
handling

## array([[4, 8, 5],
Pandas package
Series
DataFrame ## [9, 3, 4],
Import/Export data ## [1, 0, 6]])
Visual
illustrations y = arr3 % 2 == 0
y
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers ## array([[ True, True, False],
Applications ## [False, False, True],
Time series ## [False, True, True]])
Moving window
Financial applications
arr3[y]
Optimization

## array([4, 8, 4, 0, 6])
© 2020 PyEcon.org
Conditional indexing 129
Essential
concepts
Getting started Conditional indexing allows you using boolean arrays to select subsets
Procedural
programming of values and to avoid loops. Applying comparison operator on arrays,
Object-orientation
every element of the array is tested, if it corresponds to the logical
Numerical
programming condition. Consider an application setting all even numbers to 5:
NumPy package
Array basics
Linear algebra
Find and replace values in arrays
Data formats and a, b = arr3.copy(), arr3.copy()
handling
for i in range(a.shape[0]):
Pandas package
Series
for j in range(a.shape[1]):
DataFrame if a[i, j] % 2 == 0:
Import/Export data a[i, j] = 5
Visual
illustrations
Matplotlib package
b[b % 2 == 0] = 5
Figures and subplots b
Plot types and styles
Pandas layers ## array([[5, 5, 5],
Applications ## [9, 3, 5],
Time series
## [1, 5, 5]])
Moving window
Financial applications
Optimization np.allclose(a, b)

## True
© 2020 PyEcon.org
Conditional indexing 130
Essential
concepts
Getting started
Procedural Find and replace values in arrays, condition: equal
programming
Object-orientation arr3
Numerical
programming ## array([[4, 8, 5],
NumPy package
## [9, 3, 4],
Array basics
Linear algebra
## [1, 0, 6]])
Data formats and
handling arr = arr3.copy()
Pandas package arr[arr == 4] = 100
Series
arr
DataFrame
Import/Export data
## array([[100, 8, 5],
Visual
illustrations ## [ 9, 3, 100],
Matplotlib package ## [ 1, 0, 6]])
Figures and subplots
Plot types and styles
Pandas layers

Applications In this example, arr == 4 creates a boolean array as described


Time series
Moving window
before which is then used to index the array arr.
Financial applications
Optimization
Finally, every element of arr which is marked True according to
the boolean index array will be set to 100.
© 2020 PyEcon.org
Best practice: Indexing arrays 131
Essential
concepts
Getting started Step 1a
Procedural
programming Integer indexing array[row index, column index]: Indexing an n-
Object-orientation
dimensional array with n integer indices returns the single value at this
Numerical
programming position.
NumPy package
Array basics
Best practice Step 1a
Linear algebra

Data formats and mat = np.arange(12).reshape((3, 4))


handling
mat
Pandas package
Series
DataFrame
## array([[ 0, 1, 2, 3],
Import/Export data ## [ 4, 5, 6, 7],
Visual ## [ 8, 9, 10, 11]])
illustrations
Matplotlib package
mat[2, 2]
Figures and subplots
Plot types and styles
Pandas layers ## 10
Applications
Time series
mat[0, -1]
Moving window
Financial applications ## 3
Optimization

Keep in mind that, in this case only, the results are not arrays but
© 2020 PyEcon.org values!
Best practice: Indexing arrays 132
Essential
concepts
Getting started Step 1b
Procedural
programming Integer indexing array[row index]: In n-dimensional arrays, the ele-
Object-orientation
ment at each index is an (n − 1)-dimensional array.
Numerical
programming
NumPy package Best practice Step 1b
Array basics
Linear algebra mat = np.arange(12).reshape((3, 4))
Data formats and mat
handling
Pandas package
## array([[ 0, 1, 2, 3],
Series
DataFrame
## [ 4, 5, 6, 7],
Import/Export data ## [ 8, 9, 10, 11]])
Visual
illustrations mat[2]
Matplotlib package
Figures and subplots
Plot types and styles
## array([ 8, 9, 10, 11])
Pandas layers
mat[0]
Applications
Time series
Moving window ## array([0, 1, 2, 3])
Financial applications

By specifying the row index only, we create arrays which are views.
Optimization

© 2020 PyEcon.org
Best practice: Indexing arrays 133
Essential
concepts
Getting started Step 2a
Procedural
programming Slicing array[start : stop : step]: Slicing can be used separately
Object-orientation
for rows and columns.
Numerical
programming
NumPy package
Best practice Step 2a
Array basics
Linear algebra
mat = np.arange(12).reshape((3, 4))
Data formats and
mat
handling
Pandas package ## array([[ 0, 1, 2, 3],
Series
## [ 4, 5, 6, 7],
DataFrame
Import/Export data
## [ 8, 9, 10, 11]])
Visual
illustrations mat[0:2]
Matplotlib package
Figures and subplots ## array([[0, 1, 2, 3],
Plot types and styles
## [4, 5, 6, 7]])
Pandas layers

Applications
mat[0:2, ::2]
Time series
Moving window
Financial applications
## array([[0, 2],
Optimization ## [4, 6]])

© 2020 PyEcon.org
Best practice: Indexing arrays 134
Essential
concepts
Getting started Step 2b
Procedural
programming A frequent task is to get a specific row or column of an array. This can
Object-orientation
be done easily by slicing.
Numerical
programming
NumPy package Best practice Step 2b
Array basics
Linear algebra mat
Data formats and
handling ## array([[ 0, 1, 2, 3],
Pandas package ## [ 4, 5, 6, 7],
Series
DataFrame
## [ 8, 9, 10, 11]])
Import/Export data
row = mat[1] # get second row
Visual
illustrations column = mat[:, 2] # get third column
Matplotlib package row
Figures and subplots

## array([4, 5, 6, 7])
Plot types and styles
Pandas layers

Applications
Time series
column
Moving window
Financial applications ## array([ 2, 6, 10])
Optimization

Slicing with [:] means to take every element from the first to the last.
© 2020 PyEcon.org
Best practice: Indexing arrays 135
Essential
concepts
Getting started Step 3
Procedural
programming Fancy indexing array[rows list, columns list]: Return a one di-
Object-orientation
mensional array with the values at the index tuples specified elementwise
Numerical
programming by the index lists.
NumPy package
Array basics
Linear algebra
Best practice Step 3
Data formats and mat = np.arange(12).reshape((3, 4))
handling
Pandas package
mat
Series
DataFrame ## array([[ 0, 1, 2, 3],
Import/Export data
## [ 4, 5, 6, 7],
Visual ## [ 8, 9, 10, 11]])
illustrations

mat[[1, 2], [1, 2]]


Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers ## array([ 5, 10])
Applications
Time series mat[[0, -1], [-1]]
Moving window
Financial applications
## array([ 3, 11])
Optimization

The index lists might also contain just a single element.


© 2020 PyEcon.org
Best practice: Indexing arrays 136
Essential
concepts
Getting started Step 4
Procedural
programming Conditional indexing: Applying comparison operators to arrays, the
Object-orientation
boolean operations are evaluated elementwise in a vectorized fashion.
Numerical
programming
NumPy package Best practice Step 4
Array basics
Linear algebra bool_mat = mat > 0
Data formats and bool_mat
handling
Pandas package ## array([[False, True, True, True],
Series
DataFrame
## [ True, True, True, True],
Import/Export data ## [ True, True, True, True]])
Visual
illustrations mat[bool_mat] = 111 # equivalent to mat[mat > 0] = 111
Matplotlib package mat
Figures and subplots

## array([[ 0, 111, 111, 111],


Plot types and styles
Pandas layers

Applications
## [111, 111, 111, 111],
Time series
## [111, 111, 111, 111]])
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Best practice: Indexing arrays 137
Essential
concepts
Getting started Step 5
Procedural
programming Replacing values in arrays. Assigning a slice of an array to new values,
Object-orientation
the shape of slice must be considered.
Numerical
programming
NumPy package Best practice Step 5
Array basics
Linear algebra mat[0] = np.array([3, 2, 1]) # Fails because the shapes do not fit
Data formats and
handling
## Error: could not broadcast array from shape (3) into shape (4)
Pandas package
Series
mat[2, 3] = 100
DataFrame mat[:, 0] = np.array([3, 3, 3])
Import/Export data mat
Visual
illustrations ## array([[ 3, 111, 111, 111],
Matplotlib package
## [ 3, 111, 111, 111],
Figures and subplots
Plot types and styles ## [ 3, 111, 111, 100]])
Pandas layers

Applications mat[1:3, 1:3] = np.array([[0, 0], [0, 0]])


Time series mat
Moving window
Financial applications
## array([[ 3, 111, 111, 111],
## [ 3, 0, 0, 111],
Optimization

## [ 3, 0, 0, 100]])
© 2020 PyEcon.org
Reshaping arrays 138
Essential
concepts
Getting started array.reshape((rows, columns)): Reshapes an existing array.
Procedural
programming array.resize((rows, columns)): Changes array shape to rows x
Object-orientation
columns and fills new values with 0.
Numerical
programming
NumPy package
Reshape
Array basics
Linear algebra
arr = np.arange(15)
Data formats and
arr.reshape((3, 5))
handling
Pandas package ## array([[ 0, 1, 2, 3, 4],
Series
## [ 5, 6, 7, 8, 9],
DataFrame
Import/Export data
## [10, 11, 12, 13, 14]])
Visual
illustrations arr = np.arange(15)
Matplotlib package arr.resize((3, 7))
Figures and subplots
arr
Plot types and styles
Pandas layers
## array([[ 0, 1, 2, 3, 4, 5, 6],
Applications
Time series
## [ 7, 8, 9, 10, 11, 12, 13],
Moving window ## [14, 0, 0, 0, 0, 0, 0]])
Financial applications
Optimization

© 2020 PyEcon.org
Adding and removing elements of arrays 139
Essential
concepts
Getting started np.append(array, value): Appends value to the end of array.
Procedural
programming np.insert(array, index, value): Inserts values before index.
Object-orientation
np.delete(array, index, axis): Deletes row or column on index.
Numerical
programming
NumPy package Naming
Array basics
Linear algebra a = np.arange(5)
Data formats and a = np.append(a, 8)
handling a = np.insert(a, 3, 77)
Pandas package
Series
print(a)
DataFrame
Import/Export data ## [ 0 1 2 77 3 4 8]
Visual
illustrations a.resize((3, 3))
Matplotlib package
np.delete(a, 1, axis=0)
Figures and subplots
Plot types and styles
Pandas layers ## array([[0, 1, 2],
Applications
## [8, 0, 0]])
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Combining and splitting 140
Essential
concepts
Getting started np.concatenate((arr1, arr2), axis): Joins a sequence of arrays
Procedural
programming along an existing axis.
Object-orientation
np.split(array, n): Splits an array into multiple sub-arrays.
Numerical
programming np.hsplit(array, n): Splits an array into multiple sub-arrays hori-
NumPy package
Array basics
zontally.
Linear algebra

Data formats and Naming


handling
Pandas package np.concatenate((a, np.arange(6).reshape(2, 3)), axis=0)
Series
DataFrame
## array([[ 0, 1, 2],
## [77, 3, 4],
Import/Export data

Visual
illustrations
## [ 8, 0, 0],
Matplotlib package ## [ 0, 1, 2],
Figures and subplots ## [ 3, 4, 5]])
Plot types and styles

np.split(np.arange(8), 4)
Pandas layers

Applications
Time series
Moving window
## [array([0, 1]), array([2, 3]), array([4, 5]), array([6, 7])]
Financial applications
Optimization

© 2020 PyEcon.org
Transposing array 141
Essential
concepts
Getting started array.T: Returns the transposed array (as a view).
Procedural
programming
Object-orientation Transpose
Numerical
programming arr3
NumPy package
Array basics ## array([[4, 8, 5],
Linear algebra
## [9, 3, 4],
Data formats and ## [1, 0, 6]])
handling
Pandas package
Series arr3.T
DataFrame
Import/Export data ## array([[4, 9, 1],
Visual ## [8, 3, 0],
illustrations
## [5, 4, 6]])
Matplotlib package
Figures and subplots
Plot types and styles np.eye(3).T
Pandas layers

Applications ## array([[1., 0., 0.],


Time series ## [0., 1., 0.],
Moving window
## [0., 0., 1.]])
Financial applications
Optimization

© 2020 PyEcon.org
Matrix multiplication 142
Essential
concepts
Getting started np.dot(arr1, arr2): Conducts a matrix multiplication of arr1 and
Procedural
programming arr2. The @ operator can be used instead of the np.dot() function.
Object-orientation

Numerical
programming
Matrix multiplication
NumPy package
res = np.dot(arr3, np.arange(18).reshape((3, 6)))
Array basics
Linear algebra
res
Data formats and
handling ## array([[108, 125, 142, 159, 176, 193],
Pandas package ## [ 66, 82, 98, 114, 130, 146],
Series
## [ 72, 79, 86, 93, 100, 107]])
DataFrame
Import/Export data
res2 = arr3 @ np.arange(18).reshape((3, 6))
Visual
illustrations res2
Matplotlib package
Figures and subplots ## array([[108, 125, 142, 159, 176, 193],
Plot types and styles
## [ 66, 82, 98, 114, 130, 146],
Pandas layers
## [ 72, 79, 86, 93, 100, 107]])
Applications

np.allclose(res, res2)
Time series
Moving window
Financial applications
Optimization ## True

© 2020 PyEcon.org
Array functions 143
Essential
concepts
Getting started
Procedural Element-wise functions
programming
Object-orientation arr3
Numerical
programming ## array([[4, 8, 5],
NumPy package ## [9, 3, 4],
Array basics
## [1, 0, 6]])
Linear algebra

Data formats and


handling
np.sqrt(arr3)
Pandas package
Series ## array([[2. , 2.82842712, 2.23606798],
DataFrame ## [3. , 1.73205081, 2. ],
Import/Export data
## [1. , 0. , 2.44948974]])
Visual
illustrations
Matplotlib package
np.exp(arr3)
Figures and subplots
Plot types and styles ## array([[5.45981500e+01, 2.98095799e+03, 1.48413159e+02],
Pandas layers ## [8.10308393e+03, 2.00855369e+01, 5.45981500e+01],
Applications ## [2.71828183e+00, 1.00000000e+00, 4.03428793e+02]])
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Overview: Element-wise array functions 144
Essential
concepts
Getting started
Procedural
programming Function Description
Object-orientation
abs Absolute value of integer and floating point
Numerical
programming sqrt Sqare root
NumPy package
Array basics exp Exponential function
Linear algebra
log, log10, log2 Natural logarithm, log base 10, log base 2
Data formats and
handling sign Sign (1 : positiv, 0: zero, -1 : negative)
Pandas package
Series
ceil Rounding up to integer
DataFrame floor Round down to integer
Import/Export data

Visual
rint Round to nearest integer
illustrations
Matplotlib package
modf Returns fractional parts
Figures and subplots sin, cos, tan, sinh, cosh, tanh, arcsin, ...
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Binary functions 145
Essential
concepts
Getting started
Procedural Binary
programming
Object-orientation x = np.array([3, -6, 8, 4, 3, 5])
Numerical y = np.array([3, 5, 7, 3, 5, 9])
programming
np.maximum(x, y)
NumPy package
Array basics
Linear algebra ## array([3, 5, 8, 4, 5, 9])
Data formats and
handling np.greater_equal(x, y)
Pandas package
Series ## array([ True, False, True, True, False, False])
DataFrame

np.add(x, y)
Import/Export data

Visual
illustrations
Matplotlib package
## array([ 6, -1, 15, 7, 8, 14])
Figures and subplots
Plot types and styles np.mod(x, y)
Pandas layers

Applications ## array([0, 4, 1, 1, 3, 5])


Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Overview: Binary functions 146
Essential
concepts
Getting started
Procedural
programming Function Description
Object-orientation
add Add elements of arrays
Numerical
programming subtract Subtract elements in the second from the first array
NumPy package
Array basics multiply Multiply elements
Linear algebra
divide Divide elements
Data formats and
handling power Raise elements in first array to powers in second
Pandas package
Series
maximum Element-wise maximum
DataFrame minimum Element-wise minimum
Import/Export data

Visual
mod Element-wise modulus
illustrations
Matplotlib package
greater, less, equal gives boolean
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Data processing 147
Essential
concepts
Getting started np.meshgrid(array1, array2): Returns coordinate matrices from
Procedural
programming coordinate arrays.
Object-orientation
p
Numerical
programming Evaluate the function f (x , y ) = x 2 + y 2 on a 10 x 10 grid
NumPy package
Array basics p = np.arange(-5, 5, 0.01)
Linear algebra x, y = np.meshgrid(p, p)
Data formats and x
handling
Pandas package
Series
## array([[-5. , -4.99, -4.98, ..., 4.97, 4.98, 4.99],
DataFrame ## [-5. , -4.99, -4.98, ..., 4.97, 4.98, 4.99],
Import/Export data ## [-5. , -4.99, -4.98, ..., 4.97, 4.98, 4.99],
Visual ## ...,
illustrations
## [-5. , -4.99, -4.98, ..., 4.97, 4.98, 4.99],
## [-5. , -4.99, -4.98, ..., 4.97, 4.98, 4.99],
Matplotlib package
Figures and subplots
Plot types and styles ## [-5. , -4.99, -4.98, ..., 4.97, 4.98, 4.99]])
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Data processing 148
Essential
concepts
Getting started p
Procedural
programming
Evaluate the function f (x , y ) = x 2 + y 2 on a 10 x 10 grid.
Object-orientation
import matplotlib.pyplot as plt
Numerical
programming
val = np.sqrt(x**2 + y**2)
NumPy package plt.figure(figsize=(2, 2))
Array basics plt.imshow(val, cmap="hot")
Linear algebra
plt.colorbar()
Data formats and
handling
## <matplotlib.colorbar.Colorbar object at 0x7f6545c7f5b0>
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Data processing 149
Essential
concepts
Getting started p
Procedural
programming
Evaluate the function f (x , y ) = x 2 + y 2 on a 10 x 10 grid.
plt.show()
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
6
Series

4
DataFrame
Import/Export data

Visual
illustrations

2
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

0
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Conditional logic 150
Essential
concepts
Getting started np.where(condition, a, b): If condition is True, returns value a,
Procedural
programming otherwise returns b.
Object-orientation

Numerical Conditional logic


programming
NumPy package a = np.array([4, 7, 5, -7, 9, 0])
Array basics
b = np.array([-1, 9, 8, 3, 3, 3])
cond = np.array([True, True, False, True, False, False])
Linear algebra

Data formats and


handling res = np.where(cond, a, b)
Pandas package res
Series
DataFrame
## array([ 4, 7, 8, -7, 3, 3])
Import/Export data

Visual
illustrations
res = np.where(a <= b, b, a)
Matplotlib package
res
Figures and subplots
Plot types and styles ## array([4, 9, 8, 3, 9, 3])
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Conditional logic 151
Essential
concepts
Getting started
Procedural Conditional logic, examples
programming
Object-orientation arr3
Numerical
programming ## array([[4, 8, 5],
NumPy package
## [9, 3, 4],
Array basics
Linear algebra
## [1, 0, 6]])
Data formats and
handling res = np.where(arr3 < 5, 0, arr3)
Pandas package res
Series
DataFrame
## array([[0, 8, 5],
Import/Export data
## [9, 0, 0],
Visual
illustrations ## [0, 0, 6]])
Matplotlib package
Figures and subplots even = np.where(arr3 % 2 == 0, arr3, arr3 + 1)
Plot types and styles
even
Pandas layers

Applications
## array([[ 4, 8, 6],
## [10, 4, 4],
Time series
Moving window
Financial applications ## [ 2, 0, 6]])
Optimization

© 2020 PyEcon.org
Statistical methods 152
Essential
concepts
Getting started array.mean(): Computes the mean of all array elements.
Procedural
programming array.sum(): Computes the sum of all array elements.
Object-orientation

Numerical
programming
Statistical methods
NumPy package arr3
Array basics
Linear algebra
## array([[4, 8, 5],
Data formats and ## [9, 3, 4],
handling
Pandas package
## [1, 0, 6]])
Series
DataFrame arr3.mean()
Import/Export data

Visual ## 4.444444444444445
illustrations
Matplotlib package
Figures and subplots
arr3.sum()
Plot types and styles
Pandas layers ## 40
Applications
Time series arr3.argmin()
Moving window
Financial applications ## 7
Optimization

© 2020 PyEcon.org
Overview: Statistical methods 153
Essential
concepts
Getting started
Procedural
programming Method Description
Object-orientation
sum Sum of all array elements
Numerical
programming mean Mean of all array elements
NumPy package
Array basics std, var Standard deviation, variance
Linear algebra
min, max Minimum and Maximum value in array
Data formats and
handling argmin, argmax Indices of Minimum and Maximum value
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Axis 154
Essential
concepts
Getting started Axes are defined for arrays with more than one dimension. A two-
Procedural
programming dimensional array has two axes. The first one is running vertically
Object-orientation
downwards across the rows (axis=0), the second one running horizon-
Numerical
programming tally across the columns (axis=1).
NumPy package
Array basics
Linear algebra
Axis
Data formats and arr3
handling
Pandas package
## array([[4, 8, 5],
Series
DataFrame
## [9, 3, 4],
Import/Export data ## [1, 0, 6]])
Visual
illustrations arr3.sum(axis=0)
Matplotlib package
Figures and subplots
Plot types and styles
## array([14, 11, 15])
Pandas layers
arr3.sum(axis=1)
Applications
Time series
Moving window ## array([17, 16, 7])
Financial applications
Optimization

© 2020 PyEcon.org
Sorting 155
Essential
concepts
Getting started array.sort(axis): Sorts array by an axis.
Procedural
programming
Object-orientation Sorting one-dimensional arrays
Numerical
programming arr2
NumPy package
Array basics ## array([24.3 , 0. , 8.9 , 4.4 , 1.65, 45. ])
Linear algebra

Data formats and arr2.sort()


handling
arr2
Pandas package
Series
DataFrame ## array([ 0. , 1.65, 4.4 , 8.9 , 24.3 , 45. ])
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Sorting 156
Essential
concepts
Getting started
Procedural
Sorting two-dimensional arrays
programming
Object-orientation
arr3
Numerical
programming ## array([[4, 8, 5],
NumPy package ## [9, 3, 4],
Array basics
## [1, 0, 6]])
Linear algebra

Data formats and


handling
arr3.sort()
Pandas package arr3
Series
DataFrame ## array([[4, 5, 8],
Import/Export data
## [3, 4, 9],
Visual
illustrations
## [0, 1, 6]])
Matplotlib package
Figures and subplots arr3.sort(axis=0)
Plot types and styles arr3
Pandas layers

Applications ## array([[0, 1, 6],


Time series
## [3, 4, 8],
## [4, 5, 9]])
Moving window
Financial applications
Optimization

The default axis using sort() is -1, which means to sort along the
© 2020 PyEcon.org last axis (in this case axis 1).
Section 2.3 157
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Numerical programming
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Linear algebra
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Inverse matrix 158
Essential
concepts
Getting started
Procedural Import numpy.linalg
programming
Object-orientation import numpy.linalg as nplin
Numerical
programming
NumPy package nplin.inv(array): Computes the inverse matrix.
Array basics
Linear algebra
np.allclose(array1, array2): Returns True if two arrays are ele-
Data formats and
ment-wise equal within a tolerance.
handling
Pandas package
Series
Inverse
DataFrame inv = nplin.inv(arr3)
Import/Export data
inv
Visual
illustrations
Matplotlib package
## array([[ 4., -21., 16.],
Figures and subplots ## [ -5., 24., -18.],
Plot types and styles ## [ 1., -4., 3.]])
Pandas layers

Applications np.allclose(np.identity(3), np.dot(inv, arr3))


Time series
Moving window
## True
Financial applications
Optimization

© 2020 PyEcon.org
Matrix functions 159
Essential
concepts
Getting started nplin.det(array): Computes the determinant.
Procedural
programming np.trace(array): Computes the trace.
Object-orientation
np.diag(array): Returns the diagonal elements as an array.
Numerical
programming
NumPy package Linear algebra functions
Array basics
Linear algebra nplin.det(arr3)
Data formats and
handling ## -1.0
Pandas package
Series
DataFrame
np.trace(arr3)
Import/Export data
## 13
Visual
illustrations
Matplotlib package np.diag(arr3)
Figures and subplots
Plot types and styles
## array([0, 4, 9])
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Eigenvalues and eigenvectors 160
Essential
concepts
Getting started nplin.eig(array): Returns the array of eigenvalues and the array of
Procedural
programming eigenvectors as a list.
Object-orientation

Numerical
programming
Get eigenvalues and eigenvectors
NumPy package
A = np.array([[3, -1, 0], [2, 0, 0], [-2, 2, -1]])
eigenval, eigenvec = nplin.eig(A)
Array basics
Linear algebra

Data formats and


eigenval
handling
Pandas package ## array([-1., 1., 2.])
Series
DataFrame
eigenvec
Import/Export data

Visual
illustrations
## array([[ 0. , -0.40824829, -0.70710678],
Matplotlib package ## [ 0. , -0.81649658, -0.70710678],
Figures and subplots ## [ 1. , -0.40824829, 0. ]])
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Eigenvalues and eigenvectors 161
Essential
concepts
Getting started
Procedural Check eigenvalues and eigenvectors
programming
Object-orientation eigenval * eigenvec
Numerical
programming ## array([[-0. , -0.40824829, -1.41421356],
NumPy package
## [-0. , -0.81649658, -1.41421356],
Array basics
Linear algebra
## [-1. , -0.40824829, 0. ]])
Data formats and
handling np.dot(A, eigenvec)
Pandas package
Series ## array([[ 0. , -0.40824829, -1.41421356],
DataFrame
## [ 0. , -0.81649658, -1.41421356],
Import/Export data
## [-1. , -0.40824829, 0. ]])
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles        
Pandas layers 3 −1 0 0 0 0
Applications 2 0 0  · 0 = (−1) · 0 =  0 
Time series
Moving window −2 2 −1 1 1 −1
Financial applications
Optimization

© 2020 PyEcon.org
QR decomposition 162
Essential
concepts
Getting started nplin.qr(array): Conducts a QR decomposition and returns Q and
Procedural
programming R as lists.
Object-orientation

Numerical QR decomposition
programming
NumPy package Q, R = nplin.qr(arr3)
Array basics
Q
Linear algebra

Data formats and


handling
## array([[ 0. , 0.98058068, 0.19611614],
Pandas package ## [-0.6 , 0.15689291, -0.78446454],
Series ## [-0.8 , -0.11766968, 0.58834841]])
DataFrame
Import/Export data
R
Visual

## array([[ -5. , -6.4 , -12. ],


illustrations
Matplotlib package
Figures and subplots ## [ 0. , 1.0198039 , 6.07960019],
Plot types and styles ## [ 0. , 0. , 0.19611614]])
Pandas layers

Applications np.allclose(arr3, np.dot(Q, R))


Time series
Moving window
## True
Financial applications
Optimization

© 2020 PyEcon.org
Linearsystem 163
Essential
concepts
Getting started nplin.solve(A, b): Returns the solution of the linearsystem Ax = b.
Procedural
programming
Object-orientation Solve linearsystems
Numerical
programming b = np.array([7, 4, 8])
NumPy package x = nplin.solve(A, b)
Array basics
x
Linear algebra

Data formats and ## array([ 2., -1., -14.])


handling
Pandas package
Series np.allclose(np.dot(A, x), b)
DataFrame
Import/Export data ## True
Visual
illustrations
Matplotlib package
Figures and subplots    
Plot types and styles
Pandas layers
3x1 − 1x2 + 0x3 =7 x1 2
Applications
2x1 − 0x2 + 0x3 = 4 → x2  =  −1 
Time series −2x1 + 2x2 − 1x3 =8 x3 −14
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Overview: Linear algebra 164
Essential
concepts
Getting started
Procedural
programming Function Description
Object-orientation
np.dot Matrix multiplication
Numerical
programming np.trace Sum of the diagonal elements
NumPy package
Array basics np.diag Diagonal elements as an array
Linear algebra
nplin.det Matrix determinant
Data formats and
handling nplin.eig Eigenvalues and eigenvectors
Pandas package
Series
nplin.inv Inverse matrix
DataFrame nplin.qr QR decomposition
Import/Export data

Visual
nplin.solve Solve linearsystem
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Chapter 3 165
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Data formats and handling
Numerical
programming
NumPy package
Array basics
3.1 Pandas package
Linear algebra

Data formats and


3.2 Series
handling
Pandas package 3.3 DataFrame
Series
DataFrame
Import/Export data
3.4 Import/Export data
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 3.1 166
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Data formats and handling
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Pandas package
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Pandas 167
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling

The package pandas is a free software library for Python including the
Pandas package
Series
DataFrame
Import/Export data
following features:
Visual Data manipulation and analysis,
illustrations
Matplotlib package DataFrame objects and Series,
Figures and subplots
Plot types and styles
Export and import data from files and web,
Pandas layers

Applications Handling of missing data.


Time series
Moving window → Provides high-performance data structures and data analysis tools.
Financial applications
Optimization

© 2020 PyEcon.org
Motivation 168
Essential
concepts
Getting started With pandas you can import and visualize financial data in only a few
Procedural
programming lines of code.
Object-orientation

Numerical Motivation
programming
NumPy package
import pandas as pd
Array basics import matplotlib.pyplot as plt
Linear algebra

Data formats and fig = plt.figure()


handling
Pandas package
ax = fig.add_subplot(1, 1, 1)
Series dow = pd.read_csv("data/dji.csv", index_col=0, parse_dates=True)
DataFrame close = dow["Close"]
Import/Export data
close.plot(ax=ax)
Visual
illustrations
ax.set_xlabel("Date")
Matplotlib package ax.set_ylabel("Price")
Figures and subplots ax.set_title("DJI")
Plot types and styles
fig.savefig("out/dji.pdf", format="pdf")
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Motivation 169
Essential
concepts
Getting started
Procedural
programming

DJI
Object-orientation

Numerical 27500
programming
NumPy package
Array basics
25000
Linear algebra

Data formats and 22500


handling
Pandas package 20000
Series
DataFrame
17500
Price

Import/Export data

Visual
illustrations 15000
Matplotlib package
Figures and subplots 12500
Plot types and styles
Pandas layers
10000
Applications
Time series
Moving window
7500
Financial applications

6 8 0 2 4 6 8
200 200 201 201 201 201 201
Optimization

Date
© 2020 PyEcon.org
Section 3.2 170
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Data formats and handling
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Series
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Series 171
Essential
concepts
Getting started Series are a data structure in pandas.
Procedural
programming
Object-orientation
One-dimensional array-like object,
Numerical
programming Containing a sequence of values and a corresponding array of
NumPy package
Array basics
labels, called the index,
Linear algebra
The string representation of a Series displays the index on the left
Data formats and
handling and the values on the right,
Pandas package
Series The default index consists of the integers 0 through N-1.
DataFrame
Import/Export data

Visual
illustrations String representation of a Series
## 0 3
Matplotlib package
Figures and subplots
Plot types and styles ## 1 7
Pandas layers ## 2 -8
Applications ## 3 4
Time series
## 4 26
## dtype: int64
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Create Series 172
Essential
concepts
Getting started pd.Series(): Creates one-dimensional array-like object including val-
Procedural
programming ues and an index.
Object-orientation

Numerical Importing Pandas and creating a Series


programming
NumPy package import numpy as np
Array basics import pandas as pd
Linear algebra

Data formats and


handling
obj = pd.Series([2, -5, 9, 4])
Pandas package obj
Series
DataFrame ## 0 2
Import/Export data
## 1 -5
Visual
illustrations
## 2 9
Matplotlib package ## 3 4
Figures and subplots ## dtype: int64
Plot types and styles
Pandas layers

Applications
Time series Simple Series formed only from a list,
Moving window
Financial applications An index is added automatically.
Optimization

© 2020 PyEcon.org
Create Series 173
Essential
concepts
Getting started
Procedural
Series indexing vs. Numpy indexing
programming
Object-orientation obj2 = pd.Series([2, -5, 9, 4], index=["a", "b", "c", "d"])
Numerical npobj = np.array([2, -5, 9, 4])
programming obj2
NumPy package

## a 2
Array basics
Linear algebra

Data formats and


## b -5
handling ## c 9
Pandas package ## d 4
Series
DataFrame
## dtype: int64
Import/Export data

Visual
obj2["b"]
illustrations
Matplotlib package ## -5
Figures and subplots
Plot types and styles
npobj[1]
Pandas layers

Applications ## -5
Time series
Moving window
Financial applications
Optimization NumPy arrays can only be indexed by integers while Series can be
indexed by the manually set index.
© 2020 PyEcon.org
Create Series 174
Essential
concepts
Getting started Pandas Series can be created from:
Procedural
programming
Object-orientation
Lists,
Numerical NumPy arrays,
programming
NumPy package Dicts.
Array basics
Linear algebra

Data formats and


Series creation from Numpy arrays
handling
Pandas package
npobj = np.array([2, -5, 9, 4])
Series obj2 = pd.Series(npobj, index=["a", "b", "c", "d"])
DataFrame obj2
Import/Export data

Visual ## a 2
illustrations
Matplotlib package
## b -5
Figures and subplots ## c 9
Plot types and styles ## d 4
Pandas layers
## dtype: int64
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Create Series 175
Essential
concepts
Getting started
Procedural Series from dicts
programming
Object-orientation dictdata = {"Göttingen": 117665, "Northeim": 28920,
Numerical "Hannover": 532163, "Berlin": 3574830}
programming obj3 = pd.Series(dictdata)
NumPy package
obj3
Array basics
Linear algebra
## Göttingen 117665
Data formats and
handling ## Northeim 28920
Pandas package ## Hannover 532163
Series
## Berlin 3574830
DataFrame
Import/Export data
## dtype: int64
Visual
illustrations
Matplotlib package
Figures and subplots The index of the Series can be set manually,
Plot types and styles
Pandas layers Compared to NumPy array you can use the set index to select
Applications single values,
Time series
Moving window Data contained in a dict can be passed to a Series. The index of
Financial applications
Optimization
the resulting Series consists of the dict’s keys.

© 2020 PyEcon.org
Create Series 176
Essential
concepts
Getting started
Procedural Dict to Series with manual index
programming
Object-orientation cities = ["Hamburg", "Göttingen", "Berlin", "Hannover"]
Numerical obj4 = pd.Series(dictdata, index=cities)
programming obj4
NumPy package
Array basics
Linear algebra
## Hamburg NaN
## Göttingen 117665.0
Data formats and
handling ## Berlin 3574830.0
Pandas package ## Hannover 532163.0
Series
## dtype: float64
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Passing a dict to a Series, the index can be set manually,
Figures and subplots
Plot types and styles
NaN (not a number) marks missing values where the index and the
Pandas layers dict do not match.
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Series properties 177
Essential
concepts
Getting started Series.values: Returns the values of a Series.
Procedural
programming Series.index: Returns the index of a Series.
Object-orientation

Numerical Series properties


programming
NumPy package obj.values
Array basics
Linear algebra
## array([ 2, -5, 9, 4])
Data formats and
handling
obj.index
Pandas package
Series
DataFrame ## RangeIndex(start=0, stop=4, step=1)
Import/Export data

Visual obj2.index
illustrations
Matplotlib package
## Index(['a', 'b', 'c', 'd'], dtype='object')
Figures and subplots
Plot types and styles
Pandas layers

Applications The values and the index of a Series can be printed separately.
Time series
Moving window The default index, if none was explicitly specified, is a RangeIndex.
Financial applications
Optimization RangeIndex inherits from Index class.

© 2020 PyEcon.org
Selecting and manipulating values 178
Essential
concepts
Getting started
Procedural Series manipulation
programming
Object-orientation obj2[["c", "d", "a"]]
Numerical
programming ## c 9
NumPy package
Array basics
## d 4
Linear algebra ## a 2
Data formats and
## dtype: int64
handling
Pandas package obj2[obj2 < 0]
Series
DataFrame
Import/Export data
## b -5
## dtype: int64
Visual
illustrations
Matplotlib package
Figures and subplots
NumPy-like functions can be applied on Series
Plot types and styles
Pandas layers
For filtering data,
Applications To do scalar multiplications or applying math functions,
Time series
Moving window The index-value link will be preserved.
Financial applications
Optimization

© 2020 PyEcon.org
Selecting and manipulating values 179
Essential
concepts
Getting started
Procedural
Series functions
programming
Object-orientation
obj2 * 2
Numerical
programming
## a 4
NumPy package ## b -10
Array basics ## c 18
Linear algebra
## d 8
Data formats and
handling
## dtype: int64
Pandas package
Series np.exp(obj2)["a":"c"]
DataFrame
Import/Export data ## a 7.389056
Visual ## b 0.006738
illustrations
Matplotlib package
## c 8103.083928
Figures and subplots ## dtype: float64
Plot types and styles
Pandas layers "c" in obj2
Applications
Time series ## True
Moving window
Financial applications
Optimization
Mathematical functions applied to a Series will only be applied on
its values – not on its index.
© 2020 PyEcon.org
Selecting and manipulating values 180
Essential
concepts
Getting started
Procedural
Series manipulation
programming
Object-orientation obj4["Hamburg"] = 1900000
Numerical obj4
programming
NumPy package ## Hamburg 1900000.0
Array basics
Linear algebra
## Göttingen 117665.0
## Berlin 3574830.0
Data formats and
handling ## Hannover 532163.0
Pandas package ## dtype: float64
Series
DataFrame
Import/Export data
obj4[["Berlin", "Hannover"]] = [3600000, 1100000]
obj4
Visual
illustrations
Matplotlib package ## Hamburg 1900000.0
Figures and subplots ## Göttingen 117665.0
Plot types and styles
Pandas layers
## Berlin 3600000.0
## Hannover 1100000.0
Applications
Time series
## dtype: float64
Moving window
Financial applications
Optimization
Values can be manipulated by using the labels in the index,
Sets of values can be set in one line.
© 2020 PyEcon.org
Detect missing data 181
Essential
concepts
Getting started pd.isnull(): True if data is missing.
Procedural
programming pd.notnull(): False if data is missing.
Object-orientation

Numerical
programming
NaN
NumPy package pd.isnull(obj4)
Array basics
Linear algebra
## Hamburg False
Data formats and
handling
## Göttingen False
Pandas package ## Berlin False
Series ## Hannover False
DataFrame
## dtype: bool
Import/Export data

Visual
illustrations
pd.notnull(obj4)
Matplotlib package
Figures and subplots ## Hamburg True
Plot types and styles ## Göttingen True
Pandas layers
## Berlin True
Applications ## Hannover True
Time series
Moving window
## dtype: bool
Financial applications
Optimization

© 2020 PyEcon.org
Align differently indexed data 182
Essential
concepts
Getting started There are not two values to align for Hamburg and Northeim – so they
Procedural
programming are marked with NaN (not a number).
Object-orientation

Numerical
programming
NumPy package
Data 1 Data 2
Array basics
obj3 obj4
Linear algebra

Data formats and


handling
## Göttingen 117665 ## Hamburg 1900000.0
Pandas package
## Northeim 28920 ## Göttingen 117665.0
Series ## Hannover 532163 ## Berlin 3600000.0
DataFrame
## Berlin 3574830 ## Hannover 1100000.0
## dtype: int64 ## dtype: float64
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Align data
Plot types and styles
obj3 + obj4
Pandas layers

Applications ## Berlin 7174830.0


Time series
Moving window
## Göttingen 235330.0
Financial applications ## Hamburg NaN
Optimization ## Hannover 1632163.0
## Northeim NaN
## dtype: float64
© 2020 PyEcon.org
Naming Series 183
Essential
concepts
Getting started Series.name: Returns name of the Series.
Procedural
programming Series.index.name: Returns name of the Series’ index.
Object-orientation

Numerical Naming
programming
NumPy package obj4.name = "population"
Array basics
obj4.index.name = "city"
obj4
Linear algebra

Data formats and


handling
Pandas package
## city
Series ## Hamburg 1900000.0
DataFrame ## Göttingen 117665.0
## Berlin 3600000.0
Import/Export data

Visual
illustrations
## Hannover 1100000.0
Matplotlib package ## Name: population, dtype: float64
Figures and subplots
Plot types and styles
Pandas layers

Applications The attribute name will change the name of the existing Series,
Time series
Moving window There is no default name of the Series or the index.
Financial applications
Optimization

© 2020 PyEcon.org
Series vs. NumPy arrays 184
Essential
concepts
Getting started
Procedural
programming
NumPy arrays are accessed by their integer positions,
Object-orientation
Series can be accessed by a user defined index, including letters
Numerical
programming and numbers,
NumPy package
Array basics Different Series can be aligned efficiently by the index,
Linear algebra

Data formats and Series can work with missing values, so operations do not auto-
handling
Pandas package
matically fail.
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 3.3 185
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Data formats and handling
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I DataFrame
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
DataFrame 186
Essential
concepts
Getting started
Procedural
programming
DataFrames are the primary structure of pandas,
Object-orientation
It represents a table of data with an ordered collection of columns,
Numerical
programming
NumPy package
Each column can have a different data type,
Array basics
Linear algebra
A DataFrame can be thought of as a dict of Series sharing the
Data formats and same index,
handling
Pandas package Physically a DataFrame is two-dimensional but by using hierarchical
Series
DataFrame
indexing it can respresent higher dimensional data.
Import/Export data

Visual
illustrations String representation of a DataFrame
Matplotlib package
Figures and subplots ## company price volume
Plot types and styles
## 0 Daimler 69.20 4456290
## 1 E.ON 8.11 3667975
Pandas layers

Applications
## 2 Siemens 110.92 3669487
Time series
Moving window
## 3 BASF 87.28 1778058
Financial applications ## 4 BMW 87.81 1824582
Optimization

© 2020 PyEcon.org
DataFrame 187
Essential
concepts
Getting started pd.DataFrame(): Creates a DataFrame which is a two-dimensional
Procedural
programming tabular-like structure with labeled axis (rows and columns).
Object-orientation

Numerical Creating a DataFrame


programming
NumPy package data = {"company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
Array basics
"price": [69.2, 8.11, 110.92, 87.28, 87.81],
Linear algebra
"volume": [4456290, 3667975, 3669487, 1778058, 1824582]}
Data formats and
handling frame = pd.DataFrame(data)
Pandas package frame
Series
DataFrame
## company price volume
Import/Export data
## 0 Daimler 69.20 4456290
## 1 E.ON 8.11 3667975
Visual
illustrations
Matplotlib package ## 2 Siemens 110.92 3669487
Figures and subplots
## 3 BASF 87.28 1778058
## 4 BMW 87.81 1824582
Plot types and styles
Pandas layers

Applications

In this example the construction of the DataFrame frame is done


Time series
Moving window
Financial applications
Optimization
by passing a dict of equal-length lists,
Instead of passing a dict of lists, it is also possible to pass a dict
of NumPy arrays.
© 2020 PyEcon.org
Show DataFrames 188
Essential
concepts
Getting started
Procedural Print DataFrame
programming
Object-orientation frame2 = pd.DataFrame(data, columns=["company", "volume",
Numerical "price", "change"])
programming frame2
NumPy package
Array basics
Linear algebra
## company volume price change
## 0 Daimler 4456290 69.20 NaN
Data formats and
handling ## 1 E.ON 3667975 8.11 NaN
Pandas package ## 2 Siemens 3669487 110.92 NaN
Series
## 3 BASF 1778058 87.28 NaN
DataFrame
Import/Export data
## 4 BMW 1824582 87.81 NaN
Visual
illustrations
Matplotlib package
Figures and subplots
Passing a column that is not contained in the dict, it will be
Plot types and styles marked with NaN,
Pandas layers

Applications The default index will be assigned automatically as with Series.


Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Inputs to DataFrame constructor 189
Essential
concepts
Getting started
Procedural
programming Type Description
Object-orientation
2D NumPy arrays A matrix of data
Numerical
programming dict of arrays, lists, or tuples Each sequence becomes a column
NumPy package
Array basics dict of Series Each value becomes a column
Linear algebra
dict of dicts Each inner dict becomes a column
Data formats and
handling List of dicts or Series Each item becomes a row
Pandas package
Series
List of lists or tuples Treated as the 2D NumPy arrays
DataFrame Another DataFrame Same indexes
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Indexing and adding DataFrames 190
Essential
concepts
Getting started
Procedural Add data to DataFrame
programming
Object-orientation frame2["change"] = [1.2, -3.2, 0.4, -0.12, 2.4]
Numerical frame2["change"]
programming
NumPy package ## 0 1.20
Array basics
Linear algebra
## 1 -3.20
## 2 0.40
Data formats and
handling ## 3 -0.12
Pandas package ## 4 2.40
Series
## Name: change, dtype: float64
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Selecting the column of DataFrame, a Series is returned,
Figures and subplots
Plot types and styles
A attribute-like access, e.g., frame2.change, is also possible,
Pandas layers
The returned Series has the same index as the initial DataFrame.
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Indexing DataFrames 191
Essential
concepts
Getting started
Procedural Indexing DataFrames
programming
Object-orientation frame2[["company", "change"]]
Numerical
programming ## company change
NumPy package
Array basics
## 0 Daimler 1.20
Linear algebra ## 1 E.ON -3.20
Data formats and
## 2 Siemens 0.40
handling ## 3 BASF -0.12
Pandas package
## 4 BMW 2.40
Series
DataFrame
Import/Export data

Visual Using a list of multiple columns while indexing, the result is a


illustrations
Matplotlib package DataFrame,
Figures and subplots
Plot types and styles The returned DataFrame has the same index as the initial one.
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Changing DataFrames 192
Essential
concepts
Getting started del DataFrame[column]: Deletes column from DataFrame.
Procedural
programming
Object-orientation DataFrame delete column
Numerical
programming
del frame2["volume"]
NumPy package frame2
Array basics
Linear algebra ## company price change
Data formats and ## 0 Daimler 69.20 1.20
handling
Pandas package
## 1 E.ON 8.11 -3.20
Series ## 2 Siemens 110.92 0.40
DataFrame ## 3 BASF 87.28 -0.12
Import/Export data
## 4 BMW 87.81 2.40
Visual
illustrations
frame2.columns
Matplotlib package
Figures and subplots
Plot types and styles ## Index(['company', 'price', 'change'], dtype='object')
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Naming DataFrames 193
Essential
concepts
Getting started
Procedural Naming properties
programming
Object-orientation frame2.index.name = "number:"
Numerical frame2.columns.name = "feature:"
programming
frame2
NumPy package
Array basics
Linear algebra
## feature: company price change
Data formats and
## number:
handling ## 0 Daimler 69.20 1.20
Pandas package
## 1 E.ON 8.11 -3.20
## 2 Siemens 110.92 0.40
Series
DataFrame
Import/Export data ## 3 BASF 87.28 -0.12
Visual ## 4 BMW 87.81 2.40
illustrations
Matplotlib package
Figures and subplots
Plot types and styles In DataFrames there is no default name for the index or the
Pandas layers
columns.
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Reindexing 194
Essential
concepts
Getting started DataFrame.reindex(): Creates new DataFrame with data conformed
Procedural
programming to a new index, while the initial DataFrame will not be changed.
Object-orientation

Numerical
programming
Reindexing
NumPy package
frame3 = frame.reindex([0, 2, 3, 4])
frame3
Array basics
Linear algebra

Data formats and


handling ## company price volume
Pandas package ## 0 Daimler 69.20 4456290
Series
## 2 Siemens 110.92 3669487
## 3 BASF 87.28 1778058
DataFrame
Import/Export data
## 4 BMW 87.81 1824582
Visual
illustrations
Matplotlib package

Index values that are not already present will be filled with NaN by
Figures and subplots
Plot types and styles
Pandas layers
default,
Applications
Time series There are many options for filling missing values.
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Reindexing 195
Essential
concepts
Getting started
Procedural Filling missing values
programming
Object-orientation frame4 = frame.reindex(index=[0, 2, 3, 4, 5], fill_value=0,
Numerical columns=["company", "price", "market cap"])
programming
frame4
NumPy package
Array basics
Linear algebra ## company price market cap
Data formats and
## 0 Daimler 69.20 0
handling ## 2 Siemens 110.92 0
Pandas package
## 3 BASF 87.28 0
## 4 BMW 87.81 0
Series
DataFrame
Import/Export data ## 5 0 0.00 0
Visual
illustrations frame4 = frame.reindex(index=[0, 2, 3, 4], fill_value=np.nan,
Matplotlib package
columns=["company", "price", "market cap"])
frame4
Figures and subplots
Plot types and styles
Pandas layers

Applications
## company price market cap
Time series ## 0 Daimler 69.20 NaN
Moving window ## 2 Siemens 110.92 NaN
Financial applications
## 3 BASF 87.28 NaN
Optimization
## 4 BMW 87.81 NaN

© 2020 PyEcon.org
Fill NaN 196
Essential
concepts
Getting started DataFrame.fillna(value): Fills NaNs with value.
Procedural
programming
Object-orientation Filling NaN
Numerical
programming
frame4[:3]
NumPy package
Array basics ## company price market cap
Linear algebra ## 0 Daimler 69.20 NaN
Data formats and ## 2 Siemens 110.92 NaN
handling
## 3 BASF 87.28 NaN
Pandas package
Series
DataFrame frame4.fillna(1000000, inplace=True)
Import/Export data frame4[:3]
Visual
illustrations ## company price market cap
Matplotlib package
Figures and subplots
## 0 Daimler 69.20 1000000.0
Plot types and styles ## 2 Siemens 110.92 1000000.0
Pandas layers ## 3 BASF 87.28 1000000.0
Applications
Time series
Moving window
Financial applications
The option inplace=True fills the current DafaFrame (here
Optimization frame4). Without using inplace a new DataFrame will be cre-
ated, filled with NaN values.
© 2020 PyEcon.org
Dropping entries 197
Essential
concepts
Getting started DataFrame.drop(index, axis): Returns a new object with labels in
Procedural
programming requested axis removed.
Object-orientation

Numerical
programming
Dropping index
NumPy package frame5 = frame
Array basics
Linear algebra
frame5
Data formats and
handling
## company price volume
Pandas package ## 0 Daimler 69.20 4456290
Series ## 1 E.ON 8.11 3667975
DataFrame
Import/Export data
## 2 Siemens 110.92 3669487
## 3 BASF 87.28 1778058
Visual
illustrations ## 4 BMW 87.81 1824582
Matplotlib package
Figures and subplots frame5.drop([1, 2])
Plot types and styles

## company price volume


Pandas layers

Applications
## 0 Daimler 69.20 4456290
Time series
Moving window
## 3 BASF 87.28 1778058
Financial applications ## 4 BMW 87.81 1824582
Optimization

© 2020 PyEcon.org
Dropping entries 198
Essential
concepts
Getting started
Procedural Dropping column
programming
Object-orientation frame5[:2]
Numerical
programming ## company price volume
NumPy package
Array basics
## 0 Daimler 69.20 4456290
Linear algebra ## 1 E.ON 8.11 3667975
Data formats and
handling frame5.drop("price", axis=1)[:3]
Pandas package
Series ## company volume
DataFrame
Import/Export data
## 0 Daimler 4456290
## 1 E.ON 3667975
Visual
illustrations ## 2 Siemens 3669487
Matplotlib package
Figures and subplots frame5.drop(2, axis=0)
Plot types and styles

## company price volume


Pandas layers

Applications
## 0 Daimler 69.20 4456290
Time series
Moving window
## 1 E.ON 8.11 3667975
Financial applications ## 3 BASF 87.28 1778058
Optimization
## 4 BMW 87.81 1824582

© 2020 PyEcon.org
Indexing, selecting and filtering 199
Essential
concepts
Getting started Indexing of DataFrames works like indexing an numpy array, you can
Procedural
programming use the default index values and a manually set index.
Object-orientation

Numerical
programming
Indexing
NumPy package frame
Array basics

## company price volume


Linear algebra

Data formats and


handling
## 0 Daimler 69.20 4456290
Pandas package ## 1 E.ON 8.11 3667975
Series ## 2 Siemens 110.92 3669487
DataFrame
Import/Export data
## 3 BASF 87.28 1778058
## 4 BMW 87.81 1824582
Visual
illustrations
Matplotlib package frame[2:]
Figures and subplots
Plot types and styles ## company price volume
## 2 Siemens 110.92 3669487
Pandas layers

Applications
## 3 BASF 87.28 1778058
Time series
Moving window
## 4 BMW 87.81 1824582
Financial applications
Optimization

© 2020 PyEcon.org
Indexing, selecting and filtering 200
Essential
concepts
Getting started
Procedural Indexing
programming
Object-orientation frame6 = pd.DataFrame(data, index=["a", "b", "c", "d", "e"])
Numerical frame6
programming
NumPy package
Array basics
## company price volume
Linear algebra ## a Daimler 69.20 4456290
Data formats and
## b E.ON 8.11 3667975
handling ## c Siemens 110.92 3669487
Pandas package
## d BASF 87.28 1778058
## e BMW 87.81 1824582
Series
DataFrame
Import/Export data

Visual
frame6["b":"d"]
illustrations
Matplotlib package ## company price volume
Figures and subplots
## b E.ON 8.11 3667975
Plot types and styles
Pandas layers
## c Siemens 110.92 3669487
Applications
## d BASF 87.28 1778058
Time series
Moving window
Financial applications
Optimization
When slicing with labels the end element is inclusive.

© 2020 PyEcon.org
Indexing, selecting and filtering 201
Essential
concepts
Getting started DataFrame.loc(): Selects a subset of rows and columns from a
Procedural
programming DataFrame using axis labels.
Object-orientation
DataFrame.iloc(): Selects a subset of rows and columns from a
Numerical
programming DataFrame using integers.
NumPy package
Array basics
Linear algebra
Selection with loc and iloc
Data formats and frame6.loc["c", ["company", "price"]]
handling
Pandas package
## company Siemens
Series
DataFrame
## price 110.92
Import/Export data ## Name: c, dtype: object
Visual
illustrations frame6.iloc[2, [0, 1]]
Matplotlib package
Figures and subplots
Plot types and styles
## company Siemens
Pandas layers ## price 110.92
Applications
## Name: c, dtype: object
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Indexing, selecting and filtering 202
Essential
concepts
Getting started
Procedural Selection with loc and iloc
programming
Object-orientation frame6.loc[["c", "d", "e"], ["volume", "price", "company"]]
Numerical
programming ## volume price company
NumPy package ## c 3669487 110.92 Siemens
Array basics
Linear algebra
## d 1778058 87.28 BASF
## e 1824582 87.81 BMW
Data formats and
handling
Pandas package frame6.iloc[2:, ::-1]
Series
DataFrame ## volume price company
## c 3669487 110.92 Siemens
Import/Export data

Visual
illustrations
## d 1778058 87.28 BASF
Matplotlib package ## e 1824582 87.81 BMW
Figures and subplots
Plot types and styles
Pandas layers

Applications
Both of the indexing functions work with slices or lists of labels,
Time series
Moving window
Many ways to select and rearrange pandas objects.
Financial applications
Optimization

© 2020 PyEcon.org
DataFrame indexing options 203
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Type Description
Numerical
programming df[val] Select single column or set of columns
NumPy package
Array basics df.loc[val] Select single row or set of rows
Linear algebra
df.loc[:, val] Select single column or set of columns
Data formats and
handling df.loc[val1, val2] Select row and column by label
Pandas package
Series
df.iloc[where] Select row or set of rows by integer position
DataFrame df.iloc[:, where] Select column or set of columns by integer pos.
Import/Export data

Visual
df.iloc[w1, w2] Select row and column by integer position
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Hierarchical indexing 204
Essential
concepts
Getting started Hierarchical indexing enables you to have multiple index levels.
Procedural
programming
Object-orientation
Multiindex
Numerical ind = [["a", "a", "a", "b", "b"], [1, 2, 3, 1, 2]]
programming
frame6 = pd.DataFrame(np.arange(15).reshape((5, 3)), index=ind,
NumPy package
Array basics columns=["first", "second", "third"])
Linear algebra frame6
Data formats and
handling ## first second third
Pandas package
## a 1 0 1 2
Series
DataFrame
## 2 3 4 5
Import/Export data ## 3 6 7 8
Visual ## b 1 9 10 11
illustrations
## 2 12 13 14
Matplotlib package
Figures and subplots
Plot types and styles
frame6.index.names = ["index1", "index2"]
Pandas layers frame6.index
Applications
Time series ## MultiIndex([('a', 1),
Moving window ## ('a', 2),
Financial applications
Optimization
## ('a', 3),
## ('b', 1),
## ('b', 2)],
© 2020 PyEcon.org ## names=['index1', 'index2'])
Hierarchical indexing 205
Essential
concepts
Getting started
Procedural Selecting of a multiindex
programming
Object-orientation frame6.loc["a"]
Numerical
programming ## first second third
NumPy package
Array basics
## index2
Linear algebra ## 1 0 1 2
Data formats and
## 2 3 4 5
handling ## 3 6 7 8
Pandas package
Series
frame6.loc["b", 1]
DataFrame
Import/Export data
## first 9
Visual
illustrations ## second 10
Matplotlib package ## third 11
Figures and subplots
## Name: (b, 1), dtype: int64
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Operations between DataFrame and Series 206
Essential
concepts
Getting started
Procedural Series and DataFrames
programming
Object-orientation frame7 = frame[["price", "volume"]]
Numerical frame7.index = ["Daimler", "E.ON", "Siemens", "BASF", "BMW"]
programming series = frame7.iloc[2]
NumPy package
frame7
Array basics
Linear algebra
## price volume
Data formats and
handling ## Daimler 69.20 4456290
Pandas package ## E.ON 8.11 3667975
Series
## Siemens 110.92 3669487
DataFrame
Import/Export data
## BASF 87.28 1778058
Visual
## BMW 87.81 1824582
illustrations
Matplotlib package series
Figures and subplots
Plot types and styles
## price 110.92
Pandas layers
## volume 3669487.00
Applications
Time series
## Name: Siemens, dtype: float64
Moving window
Financial applications
Optimization
Here the Series was generated from the first row of the DataFrame.
© 2020 PyEcon.org
Operations between DataFrames and Series 207
Essential
concepts
Getting started
Procedural Operations between Series and DataFrames down the rows
programming
Object-orientation frame7 + series
Numerical
programming ## price volume
NumPy package
## Daimler 180.12 8125777.0
Array basics
Linear algebra
## E.ON 119.03 7337462.0
Data formats and
## Siemens 221.84 7338974.0
handling ## BASF 198.20 5447545.0
Pandas package ## BMW 198.73 5494069.0
Series
DataFrame
Import/Export data

Visual By default arithmetic operations between DataFrames and Series


illustrations
Matplotlib package
match the index of the Series on the DataFrame’s columns,
Figures and subplots
Plot types and styles The operations will be broadcasted along the rows.
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Operations between DataFrames and Series 208
Essential
concepts
Getting started
Procedural Operations between Series and DataFrames down the columns
programming
Object-orientation series2 = frame7["price"]
Numerical frame7.add(series2, axis=0)
programming
NumPy package
Array basics
## price volume
Linear algebra ## Daimler 138.40 4456359.20
Data formats and
## E.ON 16.22 3667983.11
handling ## Siemens 221.84 3669597.92
Pandas package
## BASF 174.56 1778145.28
## BMW 175.62 1824669.81
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Here, the Series was generated from the price column,
Figures and subplots
Plot types and styles
The arithmetic operation will be broadcasted along a column
Pandas layers matching the DataFrame’s row index (axis=0).
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Operations between DataFrames and Series 209
Essential
concepts
Getting started
Procedural Pandas vs Numpy
programming
Object-orientation nparr = np.arange(12.).reshape((3, 4))
Numerical row = nparr[0]
programming
nparr - row
NumPy package
Array basics
Linear algebra ## array([[0., 0., 0., 0.],
Data formats and
## [4., 4., 4., 4.],
handling ## [8., 8., 8., 8.]])
Pandas package
Series
DataFrame
Import/Export data
Operations between DataFrames are similar to operations between
Visual
illustrations one- and two-dimensional Numpy arrays,
Matplotlib package
Figures and subplots
As in DataFrames and Series the arithmetic operations will be
Plot types and styles
Pandas layers
broadcasted along the rows.
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
NumPy functions on DataFrames 210
Essential
concepts
Getting started DataFrame.apply(np.function, axis): Applies a NumPy function
Procedural
programming on the DataFrame axis. See also statistical and mathematical NumPy
Object-orientation
functions.
Numerical
programming
NumPy package
Numpy functions on DataFrames
Array basics
Linear algebra
frame7[:2]
Data formats and
handling ## price volume
Pandas package ## Daimler 69.20 4456290
Series
## E.ON 8.11 3667975
DataFrame
Import/Export data
frame7.apply(np.mean)
Visual
illustrations
Matplotlib package ## price 72.664
Figures and subplots ## volume 3079278.400
Plot types and styles
## dtype: float64
Pandas layers

Applications
frame7.apply(np.sqrt)[:2]
Time series
Moving window
Financial applications
## price volume
Optimization ## Daimler 8.318654 2110.992657
## E.ON 2.847806 1915.195812
© 2020 PyEcon.org
Grouping DataFrames 211
Essential
concepts
Getting started DataFrame.groupby(col1, col2): Groups DataFrame by columns
Procedural
programming (grouping by one or more than two columns is also possible). See also
Object-orientation
how to import data from CSV files.
Numerical
programming
NumPy package Groupby
Array basics
Linear algebra vote = pd.read_csv("data/vote.csv")[["Party", "Member", "Vote"]]
Data formats and vote.head()
handling
Pandas package
## Party Member Vote
## 0 CDU/CSU Abercron yes
Series
DataFrame
Import/Export data ## 1 CDU/CSU Albani yes
Visual ## 2 CDU/CSU Altenkamp yes
illustrations ## 3 CDU/CSU Altmaier absent
Matplotlib package
Figures and subplots
## 4 CDU/CSU Amthor yes
Plot types and styles
Pandas layers
Adding the functions count() or mean() to groupby() returns the
Applications
Time series sum or the mean of the grouped columns.
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Grouping DataFrames 212
Essential
concepts
Getting started
Procedural Groupby
programming
Object-orientation res = vote.groupby(["Party", "Vote"]).count()
Numerical res
programming
NumPy package
Array basics
## Member
Linear algebra ## Party Vote
Data formats and
## AfD absent 6
handling ## no 86
Pandas package
## BÜ90/GR absent 9
## no 58
Series
DataFrame
Import/Export data ## CDU/CSU absent 7
Visual ## yes 239
illustrations ## DIE LINKE. absent 7
Matplotlib package
Figures and subplots
## no 62
Plot types and styles ## FDP absent 5
Pandas layers ## no 75
Applications ## Fraktionslos absent 1
Time series ## no 1
Moving window
Financial applications
## SPD absent 6
Optimization ## yes 147

© 2020 PyEcon.org
Section 3.4 213
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Data formats and handling
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Import/Export data
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Reading data in text format 214
Essential
concepts
Getting started ex1.csv
Procedural
programming
Object-orientation a, b, c, d, hello
Numerical
programming
1, 2, 3, 4, world
NumPy package 5, 6, 7, 8, python
2, 3, 5, 7, pandas
Array basics
Linear algebra

Data formats and


handling
Pandas package pd.read_csv("file"): Reads CSV into DataFrame.
Series
DataFrame
Import/Export data Read comma-separated values
Visual
illustrations
df = pd.read_csv("data/ex1.csv")
Matplotlib package df
Figures and subplots
Plot types and styles ## a b c d hello
Pandas layers
## 0 1 2 3 4 world
Applications ## 1 5 6 7 8 python
Time series
Moving window
## 2 2 3 5 7 pandas
Financial applications
Optimization

© 2020 PyEcon.org
Reading data in text format 215
Essential
concepts
Getting started tab.txt
Procedural
programming
Object-orientation a| b| c| d| hello
Numerical
programming
1| 2| 3| 4| world
NumPy package 5| 6| 7| 8| python
2| 3| 5| 7| pandas
Array basics
Linear algebra

Data formats and


handling
Pandas package pd.read_table("file", sep): Reads table with any seperators into
Series
DataFrame DataFrame.
Import/Export data

Visual Read table values


illustrations
Matplotlib package
df = pd.read_table("data/tab.txt", sep="|")
Figures and subplots df
Plot types and styles
Pandas layers
## a b c d hello
Applications ## 0 1 2 3 4 world
Time series
## 1 5 6 7 8 python
Moving window
Financial applications ## 2 2 3 5 7 pandas
Optimization

© 2020 PyEcon.org
Reading data in text format 216
Essential
concepts
Getting started ex2.csv
Procedural
programming
Object-orientation 1, 2, 3, 4, world
Numerical
programming
5, 6, 7, 8, python
NumPy package 2, 3, 5, 7, pandas
Array basics
Linear algebra

Data formats and


handling
CSV file without header row:
Pandas package
Series Read CSV and header settings
DataFrame
Import/Export data df = pd.read_csv("data/ex2.csv", header=None)
Visual df
illustrations
Matplotlib package ## 0 1 2 3 4
Figures and subplots
Plot types and styles
## 0 1 2 3 4 world
Pandas layers ## 1 5 6 7 8 python
Applications
## 2 2 3 5 7 pandas
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Reading data in text format 217
Essential
concepts
Getting started ex2.csv
Procedural
programming
Object-orientation 1, 2, 3, 4, world
Numerical
programming
5, 6, 7, 8, python
NumPy package 2, 3, 5, 7, pandas
Array basics
Linear algebra

Data formats and


handling
Specify header:
Pandas package
Series Read CSV and header names
DataFrame
Import/Export data df = pd.read_csv("data/ex2.csv",
Visual names=["a", "b", "c", "d", "hello"])
illustrations df
Matplotlib package
Figures and subplots
Plot types and styles
## a b c d hello
Pandas layers ## 0 1 2 3 4 world
Applications
## 1 5 6 7 8 python
Time series ## 2 2 3 5 7 pandas
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Reading data in text format 218
Essential
concepts
Getting started ex2.csv
Procedural
programming
Object-orientation 1, 2, 3, 4, world
Numerical
programming
5, 6, 7, 8, python
NumPy package 2, 3, 5, 7, pandas
Array basics
Linear algebra

Data formats and


handling
Use hello-column as the index:
Pandas package
Series Read CSV and specify index
DataFrame
Import/Export data df = pd.read_csv("data/ex2.csv",
Visual names=["a", "b", "c", "d", "hello"],
illustrations index_col="hello")
Matplotlib package
df
Figures and subplots
Plot types and styles
Pandas layers ## a b c d
Applications
## hello
Time series ## world 1 2 3 4
Moving window ## python 5 6 7 8
Financial applications
Optimization
## pandas 2 3 5 7

© 2020 PyEcon.org
Reading data in text format 219
Essential
concepts
Getting started ex3.csv
Procedural
programming
Object-orientation 1, 2, 3, 4, world
Numerical
programming
#+#-.,.-'*'-.,
NumPy package 5, 6, 7, 8, python
87646756754456978
Array basics
Linear algebra

Data formats and 2, 3, 5, 7, pandas


handling
Pandas package
Series
DataFrame
Skip rows while reading:
Import/Export data

Visual Read CSV and choose rows


illustrations
Matplotlib package df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
Figures and subplots df
Plot types and styles
Pandas layers
## 1 2 3 4 world
Applications ## 0 5 6 7 8 python
## 1 2 3 5 7 pandas
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Writing data to text file 220
Essential
concepts
Getting started DataFrame.to_csv("filename"): Writes DataFrame to CSV.
Procedural
programming
Object-orientation Write to CSV
Numerical df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
programming
NumPy package
df.to_csv("out/out1.csv")
Array basics
Linear algebra
out1.csv
Data formats and
handling
Pandas package ,1, 2, 3, 4, world
0,5,6,7,8, python
Series
DataFrame
Import/Export data
1,2,3,5,7, pandas
Visual
illustrations
Matplotlib package
Figures and subplots In the .csv file, the index and header is included (reason why ,1).
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Writing data to text file 221
Essential
concepts
Getting started
Procedural Write to CSV and settings
programming
Object-orientation df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
Numerical df.to_csv("out/out2.csv", index=False, header=False)
programming
NumPy package
Array basics out2.csv
Linear algebra

Data formats and


handling
5,6,7,8, python
Pandas package 2,3,5,7, pandas
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Writing data to text file 222
Essential
concepts
Getting started
Procedural Write to CSV and specify header
programming
Object-orientation df = pd.read_csv("data/ex3.csv", skiprows=[1, 3, 4])
Numerical df.to_csv("out/out3.csv", index=False,
programming
header=["a", "b", "c", "d", "e"])
NumPy package
Array basics
Linear algebra
out3.csv
Data formats and
handling
Pandas package a,b,c,d,e
Series
DataFrame
5,6,7,8, python
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Reading Excel files 223
Essential
concepts
Getting started pd.read_excel("file.xls"): Reads .xls files.
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data
Figure: goog.xls
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles Reading Excel
Pandas layers
xls_frame = pd.read_excel("data/goog.xls")
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Reading Excel files 224
Essential
concepts
Getting started
Procedural Excel as a DataFrame
programming
Object-orientation xls_frame[["Adj Close", "Volume", "High"]]
Numerical
programming ## Adj Close Volume High
NumPy package ## 0 1169.939941 1538700 1173.000000
Array basics
Linear algebra
## 1 1167.699951 2412100 1174.000000
## 2 1111.900024 4857900 1123.069946
Data formats and
handling ## 3 1055.800049 3798300 1110.000000
Pandas package ## 4 1080.599976 3448000 1081.709961
Series
## 5 1048.579956 2341700 1081.780029
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Remote data access 225
Essential
concepts
Getting started Extract financial data from Internet sources into a DataFrame. There
Procedural
programming are different sources offering different kind of data. Some sources are:
Object-orientation

Numerical
Robinhood
programming
NumPy package IEX
Array basics
Linear algebra Yahoo Finance
Data formats and
handling
World Bank
Pandas package
Series
OECD
DataFrame
Import/Export data
Eurostat
Visual
illustrations
A complete list of the sources and the usage can be found here:
Matplotlib package pandas-datareader
Figures and subplots
Plot types and styles
Pandas layers
Import pandas-datareader
Applications from pandas_datareader import data
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Data access: Yahoo Finance 226
Essential
concepts
Getting started data.DataReader("symbol", "source", "start", "end"): Returns
Procedural
programming financial data of a stock in a certain time period.
Object-orientation

Numerical
programming
Get data of Ford
NumPy package ford = data.DataReader("F", "yahoo", "2020-01-01", "2020-01-31")
Array basics
ford.head()[["Close", "Volume"]]
Linear algebra

Data formats and


handling
## Close Volume
Pandas package ## Date
Series ## 2020-01-02 9.42 43425700.0
DataFrame
## 2020-01-03 9.21 45040800.0
## 2020-01-06 9.16 43372300.0
Import/Export data

Visual
illustrations ## 2020-01-07 9.25 44984100.0
Matplotlib package ## 2020-01-08 9.25 45994900.0
Figures and subplots
Plot types and styles
Stock code list
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Data access: Yahoo Finance 227
Essential
concepts
Getting started
Procedural Explore Ford dataset
programming
Object-orientation ford.index
Numerical ## DatetimeIndex(['2020-01-02', '2020-01-03',...
programming
NumPy package
## ...dtype='datetime64[ns]', name='Date',...
ford.loc["2020-01-28"]
Array basics
Linear algebra

Data formats and


handling ## High 9.000000e+00
Pandas package ## Low 8.860000e+00
Series
## Open 8.940000e+00
## Close 8.970000e+00
DataFrame
Import/Export data

Visual
## Volume 8.516340e+07
illustrations ## Adj Close 8.820001e+00
Matplotlib package ## Name: 2020-01-28 00:00:00, dtype: float64
Figures and subplots
Plot types and styles
Pandas layers

Applications
DataFrame index
Time series
Moving window
Index of the DataFrame is different at different sources. Always check
Financial applications DataFrame.index!
Optimization

© 2020 PyEcon.org
Data access: Yahoo Finance 228
Essential
concepts
Getting started
Procedural Download and explore SAP data
programming
Object-orientation sap = data.DataReader("SAP", "yahoo", "2020-01-01", "2020-06-30")
Numerical sap[25:27]
programming
NumPy package
Array basics
## High Low ... Volume Adj Close
Linear algebra ## Date ...
Data formats and
## 2020-02-07 136.020004 134.639999 ... 511700.0 133.137955
handling ## 2020-02-10 135.369995 134.679993 ... 381200.0 133.305527
Pandas package
##
## [2 rows x 6 columns]
Series
DataFrame
Import/Export data

Visual
sap.loc["2020-03-09"]
illustrations
Matplotlib package ## High 1.161900e+02
Figures and subplots
## Low 1.105500e+02
Plot types and styles
Pandas layers
## Open 1.136100e+02
Applications
## Close 1.115000e+02
Time series ## Volume 1.571800e+06
Moving window ## Adj Close 1.099132e+02
Financial applications
## Name: 2020-03-09 00:00:00, dtype: float64
Optimization

© 2020 PyEcon.org
Data access: Eurostat 229
Essential
concepts
Getting started
Procedural Eurostat
programming
Object-orientation population = data.DataReader("tps00001", "eurostat", "2010-01-01",
Numerical "2020-01-01")
programming
NumPy package
population.columns
Array basics
Linear algebra
## MultiIndex(levels=[[Population on 1 January - total], [Albania,
## Andorra, Armenia, Austria, Azerbaijan, Belarus, Belgium, ...
Data formats and
handling
population["Population on 1 January - total", "France"][-5:]
Pandas package
Series ## FREQ Annual
## TIME_PERIOD
DataFrame
Import/Export data
## 2016-01-01 66638391.0
Visual
illustrations ## 2017-01-01 66809816.0
Matplotlib package ## 2018-01-01 66918941.0
Figures and subplots
## 2019-01-01 67012883.0
Plot types and styles
Pandas layers
## 2020-01-01 67098824.0
Applications
Time series Eurostat Database
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Read data from HTML 230
Essential
concepts
Getting started Website used for the example: Econometrics
Procedural
programming
Object-orientation Beautiful Soup
Numerical
programming
from bs4 import BeautifulSoup
NumPy package import requests
Array basics url = "www.uni-goettingen.de/de/applied-econometrics/412565.html"
Linear algebra
r = requests.get("https://" + url)
Data formats and
handling
d = r.text
Pandas package soup = BeautifulSoup(d, "lxml")
soup.title
Series
DataFrame

## <title>Applied Econometrics - Georg-August-... ...</title>


Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Reading data from HTML in detail exceeds the content of this course.
Plot types and styles If you are interested in this kind of importing data, you can find detailed
Pandas layers
information on Beautiful Soup here.
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Motivation 231
Essential
concepts
Getting started
Procedural Bollinger
programming
Object-orientation sap = data.DataReader("SAP", "yahoo", "2019-01-01", "2020-08-31")
Numerical sap.index = pd.to_datetime(sap.index)
programming
boll = sap["Close"].rolling(window=20, center=False).mean()
NumPy package
Array basics
std = sap["Close"].rolling(window=20, center=False).std()
Linear algebra upp = boll + std * 2
Data formats and low = boll - std * 2
handling fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
Pandas package
Series
DataFrame boll.plot(ax=ax, label="20 days Rolling mean")
Import/Export data upp.plot(ax=ax, label="Upper Band")
Visual low.plot(ax=ax, label="Lower Band")
illustrations
sap["Close"].plot(ax=ax, label="SAP Price")
Matplotlib package
Figures and subplots ax.legend(loc="best")
Plot types and styles fig.savefig("out/boll.pdf")
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Motivation 232
Essential
concepts
Getting started
Procedural
programming 20 days Rolling mean
Object-orientation Upper Band
Lower Band
Numerical 160 SAP Price
programming
NumPy package
Array basics
Linear algebra

Data formats and


140
handling
Pandas package
Series

120
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots 100
Plot types and styles
Pandas layers

Applications
1 3 5 7 9 1 1 3 5 7 9
9-0 019-0 019-0 019-0 019-0 019-1 020-0 020-0 020-0 020-0 020-0
Time series
Moving window
201 2 2 2 2 2 2 2 2 2 2
Financial applications Date
Optimization

© 2020 PyEcon.org
Chapter 4 233
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Visual illustrations
Numerical
programming
NumPy package
Array basics
4.1 Matplotlib package
Linear algebra

Data formats and


4.2 Figures and subplots
handling
Pandas package 4.3 Plot types and styles
Series
DataFrame
Import/Export data
4.4 Pandas layers
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 4.1 234
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Visual illustrations
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Matplotlib package
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
matplotlib 235
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics

The package matplotlib is a free software library for python including


Linear algebra

Data formats and


handling the following functions:
Pandas package
Series Image plots, Contour plots, Scatter plots, Polar plots, Line plots,
DataFrame
Import/Export data 3D plots,
Visual
illustrations Variety of hardcopy formats,
Matplotlib package
Figures and subplots
Works in Python scripts, the Python and IPython shell and the
Plot types and styles
Jupyter notebook,
Pandas layers

Applications Interactive environments.


Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
matplotlib 236
Essential
concepts
Getting started
Procedural
Usage of matplotlib
programming
Object-orientation matplotlib has a vast number of functions and options, which is hard
Numerical
programming
to remember. But for almost every task there is an example you can
NumPy package take code from. A great source of information is the examples gallery
on the matplotlib homepage. Also note the best practice quick start
Array basics
Linear algebra

Data formats and guide.


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Simple plot 237
Essential
concepts
Getting started plt.plot(array): Plots the values of a list, the X-axis has by default
Procedural
programming the range [0, 1, ..., n-1].
Object-orientation

Numerical
programming
Import matplotlib and simple example
NumPy package import matplotlib.pyplot as plt
Array basics
Linear algebra
import numpy as np
plt.plot(np.arange(10))
Data formats and
handling plt.savefig("out/list.pdf")
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations 8

Matplotlib package
6
Figures and subplots
Plot types and styles
4
Pandas layers

Applications 2

Time series
Moving window 0
0 2 4 6 8
Financial applications
Optimization

© 2020 PyEcon.org
Section 4.2 238
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Visual illustrations
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Figures and subplots
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Figures 239
Essential
concepts
Getting started Plots in matplotlib reside in a Figure object:
Procedural
programming plt.figure(...): Creates new Figure object allowing for multiple
Object-orientation
parameters.
Numerical
programming plt.gcf(): Returns the reference of the active figure.
NumPy package
Array basics
Linear algebra
Create Figures
Data formats and fig = plt.figure(figsize=(16, 8))
handling
Pandas package
print(plt.gcf())
Series
DataFrame ## Figure(1600x800)
Import/Export data

Visual
illustrations
Matplotlib package A Figure object can be considered as an empty window,
Figures and subplots
Plot types and styles The Figure object has a number of options, such as the size or
Pandas layers

Applications
the aspect ratio,
Time series
Moving window
You cannot draw a plot in a blank figure. There has to be a
Financial applications subplot in the Figure object.
Optimization

© 2020 PyEcon.org
Saving plots to file 240
Essential
concepts
Getting started plt.savefig("filename"): Saves active figure to file.
Procedural
programming Available file formats are among others:
Object-orientation

Numerical
programming Filename extension Description
NumPy package
Array basics .png Portable Network Graphics
Linear algebra
.pdf Portable Document Format
Data formats and
handling .svg Scalable Vector Graphics
Pandas package
Series
.jpeg JPEG File Interchange Format
DataFrame
Import/Export data
.jpg JPEG File Interchange Format
Visual .ps PostScript
illustrations
Matplotlib package
.raw Raw Image Format
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Subplots 241
Essential
concepts
Getting started fig.add_subplot(): Adds subplot to the Figure fig.
Procedural
programming Example: fig.add_subplot(2, 2, 1) creates four subplots and se-
Object-orientation
lects the first.
Numerical
programming
NumPy package
Adding subplots
Array basics
Linear algebra
ax1 = fig.add_subplot(2, 2, 1)
Data formats and
ax2 = fig.add_subplot(2, 2, 2)
handling ax3 = fig.add_subplot(2, 2, 3)
Pandas package
ax4 = fig.add_subplot(2, 2, 4)
fig.savefig("out/subplots.pdf")
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
The Figure object is filled with subplots in which the plots reside,
Using the plt.plot() command without creating a subplot in
Figures and subplots
Plot types and styles
Pandas layers
advance, matplotlib will create a Figure object and a subplot
Applications
Time series
automatically,
Moving window
Financial applications
The Figure object and its subplots can be created in one line.
Optimization

© 2020 PyEcon.org
Subplots 242
Essential
concepts
Getting started
Procedural
programming
Object-orientation
1.0 1.0
Numerical 0.8 0.8
programming
NumPy package 0.6 0.6

Array basics 0.4 0.4


Linear algebra
0.2 0.2
Data formats and
0.0 0.0
handling 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0
Pandas package 1.0 1.0
Series
0.8 0.8
DataFrame
Import/Export data 0.6 0.6

Visual 0.4 0.4


illustrations
0.2 0.2
Matplotlib package
0.0 0.0
Figures and subplots 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Subplots 243
Essential
concepts
Getting started
Procedural Filling subplots with content
programming
Object-orientation from numpy.random import randn
Numerical ax1.plot([5, 7, 4, 3, 1])
programming ax2.hist(randn(100), bins=20, color="r")
ax3.scatter(np.arange(30), np.arange(30) * randn(30))
NumPy package
Array basics
Linear algebra ax4.plot(randn(40), "k--")
Data formats and fig.savefig("out/content.pdf")
handling
Pandas package
Series
DataFrame The subplots in one Figure object can be filled with different plot
Import/Export data

Visual
types,
illustrations
Matplotlib package
Using only plt.plot() matplotlib draws the plot in the last
Figures and subplots Figure object and last subplot selected.
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Subplots 244
Essential
concepts
Getting started
Procedural
programming
Object-orientation
7 14
Numerical 6 12
programming 10
5
NumPy package 8
4
Array basics 6
3
Linear algebra 4
2
2
Data formats and 1
0
handling 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 2 1 0 1 2
Pandas package
2
Series 40
DataFrame 20 1
Import/Export data
0 0
Visual 20
illustrations 1
40
Matplotlib package 2
Figures and subplots 0 5 10 15 20 25 30 0 5 10 15 20 25 30 35 40
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Standard creation of plots 245
Essential
concepts
Getting started plt.subplots(nrows, ncols, sharex, sharey): Creates figure and
Procedural
programming subplots in one line. If sharex or sharey are True, all subplots share
Object-orientation
the same X- or Y-ticks.
Numerical
programming
NumPy package
Standard creation
Array basics fig, axes = plt.subplots(2, 3, figsize=(16, 8), sharey=True)
Linear algebra
axes[1, 1].plot(np.arange(7), color="r")
Data formats and
handling
axes[0, 2].plot(np.arange(10, 0, -1))
Pandas package fig.savefig("out/standard.pdf")
Series
DataFrame
Import/Export data

Visual 10
illustrations 8
Matplotlib package 6

Figures and subplots 4

Plot types and styles 2

0
Pandas layers 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0 2 4 6 8

10
Applications
8
Time series
6
Moving window
4
Financial applications 2
Optimization 0
0.0 0.2 0.4 0.6 0.8 1.0 0 1 2 3 4 5 6 0.0 0.2 0.4 0.6 0.8 1.0

© 2020 PyEcon.org
Section 4.3 246
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Visual illustrations
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Plot types and styles
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plot types 247
Essential
concepts
Getting started ax.scatter(x, y): Creates a scatter plot of x vs y.
Procedural
programming ax.hist(x, bins): Creates a histogram.
Object-orientation
ax.fill_between(x, y, a): Creates a plot of x vs y and fills plot
Numerical
programming between a and y.
NumPy package
Array basics
Linear algebra
Types
Data formats and fig, ax = plt.subplots(1, 3, figsize=(16, 8))
handling
Pandas package
ax[0].hist([1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 2, 3, 4, 4],
Series bins=5, color="yellow")
DataFrame x = np.arange(0, 10, 0.1)
Import/Export data
y = np.sin(x)
Visual
illustrations
ax[1].fill_between(x, y, 0, color="green")
Matplotlib package ax[2].scatter(x, y)
Figures and subplots fig.savefig("out/types.pdf")
Plot types and styles
Pandas layers

Applications A vast number of plot types can be found in the examples gallery.
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plot types 248
Essential
concepts
Getting started
Procedural
programming
Object-orientation
5 1.00 1.00
Numerical
programming
0.75 0.75
NumPy package
4
Array basics 0.50 0.50
Linear algebra
0.25 0.25
Data formats and 3
handling
0.00 0.00
Pandas package
Series 2 0.25 0.25
DataFrame
Import/Export data 0.50 0.50

1
Visual 0.75 0.75
illustrations
Matplotlib package 1.00 1.00
0
Figures and subplots 1 2 3 4 5 0 2 4 6 8 10 0 2 4 6 8 10
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Adjusting the spacing around subplots 249
Essential
concepts
Getting started plt.subplots_adjust(left, bottom, ..., hspace): Sets the space
Procedural
programming between the subplots. wspace and hspace control the percentage of
Object-orientation
the figure width and figure height, respectively, to use as spacing
Numerical
programming between subplots.
NumPy package
Array basics
Linear algebra
Adjust spacing
Data formats and fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
handling
Pandas package
for i in range(2):
Series for j in range(2):
DataFrame axes[i][j].plot(randn(10))
Import/Export data
plt.subplots_adjust(wspace=0, hspace=0)
Visual
illustrations
fig.savefig("out/spacing.pdf")
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Adjusting the spacing around subplots 250
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical 3
programming
NumPy package 2
Array basics
Linear algebra 1
Data formats and
handling 0
Pandas package
Series 1
DataFrame
Import/Export data
3
Visual
illustrations 2
Matplotlib package
Figures and subplots 1
Plot types and styles
Pandas layers
0
Applications
Time series 1
Moving window
Financial applications
0 2 4 6 8 0 2 4 6 8
Optimization

© 2020 PyEcon.org
Colors, markers and line styles 251
Essential
concepts
Getting started ax.plot(data, linestyle, color, marker): Sets data and styles
Procedural
programming of subplot ax.
Object-orientation

Numerical
programming
Styles
NumPy package
fig, ax = plt.subplots(1, figsize=(15, 6))
ax.plot(randn(10), linestyle="--", color="darkcyan", marker="p")
Array basics
Linear algebra

Data formats and


fig.savefig("out/style.pdf")
handling
Pandas package
Series
DataFrame
Import/Export data
2.0
Visual
illustrations 1.5
Matplotlib package
Figures and subplots 1.0

Plot types and styles


0.5
Pandas layers
0.0
Applications
Time series 0.5
Moving window
1.0
Financial applications
0 2 4 6 8
Optimization

© 2020 PyEcon.org
Plot colors 252
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plot line styles 253
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plot markers 254
Essential
concepts
Getting started
Procedural
Marker Description
programming
Object-orientation
"." point
Numerical "," pixel
programming
NumPy package
"o" circle
Array basics "v" triangle_down
Linear algebra

Data formats and


"8" octagon
handling
Pandas package
"s" square
Series "p" pentagon
DataFrame
Import/Export data "P" plus (filled)
Visual "*" star
illustrations
Matplotlib package "h" hexagon1
Figures and subplots
Plot types and styles
"H" hexagon2
Pandas layers "+" plus
Applications
Time series
"x" x
Moving window "X" x (filled)
Financial applications
Optimization "D" diamond

© 2020 PyEcon.org
Ticks and labels 255
Essential
concepts
Getting started ax.set_xticks(): Sets list of X-ticks, analogously for Y-axis.
Procedural
programming ax.set_xlabel(): Sets the X-label.
Object-orientation
ax.set_title(): Sets the subplot title.
Numerical
programming
NumPy package Ticks and labels - default
Array basics
Linear algebra
fig, ax = plt.subplots(1, figsize=(15, 10))
Data formats and
ax.plot(randn(1000).cumsum())
handling fig.savefig("out/withoutlabels.pdf")
Pandas package
Series
DataFrame
Import/Export data
Here, we create a Figure object as well as a subplot and fill it
Visual
illustrations with a line plot of a random walk,
Matplotlib package
Figures and subplots By default matplotlib places the ticks evenly distributed along the
Plot types and styles
Pandas layers
data range. Individual ticks can be set as follows,
Applications By default there is no axis label or title.
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Ticks and labels 256
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package 60
Array basics
Linear algebra

Data formats and


handling
40
Pandas package
Series
DataFrame
Import/Export data
20
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles 0
Pandas layers

Applications
Time series
0 200 400 600 800 1000
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Ticks and labels 257
Essential
concepts
Getting started
Procedural Set ticks and labels
programming
Object-orientation ax.set_xticks([0, 250, 500, 750, 1000])
Numerical ax.set_xlabel("Days", fontsize=20)
programming ax.set_ylabel("Change", fontsize=20)
NumPy package
ax.set_title("Simulation", fontsize=30)
Array basics
Linear algebra
fig.savefig("out/labels.pdf")
Data formats and
handling

The individual ticks are given as a list to ax.set_xticks(),


Pandas package
Series
DataFrame
Import/Export data The label and titel can be set to an individual size using the
Visual argument fontsize.
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Ticks and labels 258
Essential
concepts
Getting started

Simulation
Procedural
programming
Object-orientation

Numerical
programming
NumPy package 60
Array basics
Linear algebra

Data formats and


handling
40
Pandas package
Change

Series
DataFrame
Import/Export data
20
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles 0
Pandas layers

Applications
Time series
0 250 500 750 1000
Moving window Days
Financial applications
Optimization

© 2020 PyEcon.org
Legends 259
Essential
concepts
Getting started Using multiple plots in one subplot one needs a legend.
Procedural
programming ax.legend(loc): Shows the legend at location loc.
Object-orientation
Some options: "best", "upper right", "center left", ...
Numerical
programming
NumPy package Set legend
Array basics
Linear algebra fig = plt.figure(figsize=(15, 10))
Data formats and ax = fig.add_subplot(1, 1, 1)
handling ax.plot(randn(1000).cumsum(), label="first")
ax.plot(randn(1000).cumsum(), label="second")
Pandas package
Series
DataFrame ax.plot(randn(1000).cumsum(), label="third")
Import/Export data ax.legend(loc="best", fontsize=20)
Visual fig.savefig("out/legend.pdf")
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers
The legend displays the label and the color of the associated plot,
Applications Using the option "best" the legend will placed in a corner where
Time series
Moving window is does not interfere the plots.
Financial applications
Optimization

© 2020 PyEcon.org
Legends 260
Essential
concepts
Getting started
Procedural
programming
Object-orientation
80
Numerical
first
programming second
NumPy package 60 third
Array basics
Linear algebra
40
Data formats and
handling
Pandas package 20
Series
DataFrame
Import/Export data 0

Visual
illustrations
20
Matplotlib package
Figures and subplots
Plot types and styles 40
Pandas layers

Applications
60
Time series
0 200 400 600 800 1000
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Annotations on a subplot 261
Essential
concepts
Getting started ax.text(x, y, "text", fontsize): Inserts a text into a subplot.
Procedural
programming ax.annotate("text", xy, xytext, arrwoprops): Inserts an ar-
Object-orientation
row with annotations.
Numerical
programming
NumPy package
Annotations
Array basics
ax.text(400, -30, "here", fontsize=50)
Linear algebra
ax.annotate("there",
Data formats and
handling fontsize=40,
Pandas package xy=(0, 0),
Series
xytext=(400, 8),
arrowprops=dict(facecolor="black",
DataFrame
Import/Export data

Visual
shrink=0.05))
illustrations ax.set_yticks([-40, -30, -20, -10, 0, 10, 20, 30, 40])
Matplotlib package fig.savefig("out/arrow.pdf")
Figures and subplots
Plot types and styles
Pandas layers

Applications Using ax.annotate() the arrow head points at xy and the


bottom left corner of the text will be placed at xytext.
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Annotations 262
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
first
programming second
NumPy package third
Array basics
Linear algebra
40
Data formats and
handling 30

there
Pandas package 20
Series
10
DataFrame
Import/Export data 0

Visual 10

here
illustrations
20
Matplotlib package
Figures and subplots 30
Plot types and styles 40
Pandas layers

Applications
Time series
0 200 400 600 800 1000
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Annotations 263
Essential
concepts
Getting started
Procedural Annotation Lehman
programming
Object-orientation import pandas as pd
Numerical
from datetime import datetime
programming
NumPy package
date = datetime(2008, 9, 15)
fig = plt.figure(figsize=(16, 8))
Array basics
Linear algebra

Data formats and


ax = fig.add_subplot(1, 1, 1)
handling dow = pd.read_csv("data/dji.csv", index_col=0, parse_dates=True)
Pandas package close = dow["Close"]
Series
DataFrame
close.plot(ax=ax)
Import/Export data ax.annotate("Lehman Bankruptcy",
Visual
fontsize=30,
illustrations xy=(date, close.loc[date] + 400),
Matplotlib package
xytext=(date, 22000),
Figures and subplots
Plot types and styles
arrowprops=dict(facecolor="red",
Pandas layers shrink=0.03))
Applications ax.set_title("Dow Jones Industrial Average", size=40)
Time series fig.savefig("out/lehman.pdf")
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Annotations 264
Essential
concepts
Getting started

Dow Jones Industrial Average


Procedural
programming
Object-orientation
27500
Numerical
programming 25000

NumPy package
Array basics
22500 Lehman Bankruptcy
Linear algebra 20000

Data formats and 17500


handling
15000
Pandas package
Series 12500
DataFrame
10000
Import/Export data
7500
Visual
illustrations
6 8 0 2 4 6 8
200 200 201 201 201 201 201
Matplotlib package Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Drawing on a subplot 265
Essential
concepts
Getting started plt.Rectangle((x, y), width, height, angle): Creates a rect-
Procedural
programming angle
Object-orientation
plt.Circle((x,y), radius): Creates a circle.
Numerical
programming
NumPy package Drawing
Array basics
Linear algebra fig = plt.figure(figsize=(6, 6))
Data formats and ax = fig.add_subplot(1, 1, 1)
handling ax.set_xticks([0, 1, 2, 3, 4, 5])
ax.set_yticks([0, 1, 2, 3, 4, 5])
Pandas package
Series
DataFrame rectangle = plt.Rectangle((1.5, 1),
Import/Export data width=0.8, height=2,
Visual color="red", angle=30)
illustrations
Matplotlib package
circ = plt.Circle((3, 3),
Figures and subplots radius=1, color="blue")
Plot types and styles ax.add_patch(rectangle)
Pandas layers
ax.add_patch(circ)
Applications fig.savefig("out/draw.pdf")
Time series
Moving window
Financial applications A list of all available patches can be found here: matplotlib-patches
Optimization

© 2020 PyEcon.org
Drawing on a subplot 266
Essential
concepts
Getting started
Procedural
programming
Object-orientation 5
Numerical
programming
NumPy package
Array basics 4
Linear algebra

Data formats and


handling
Pandas package 3
Series
DataFrame
Import/Export data

Visual 2
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
1
Pandas layers

Applications
Time series
Moving window 0
Financial applications 0 1 2 3 4 5
Optimization

© 2020 PyEcon.org
Best practice: Visual illustrations 267
Essential
concepts
Getting started Step 1
Procedural
programming Create a Figure object and subplots
Object-orientation

Numerical
programming
Best practice Step 1
NumPy package
fig, ax = plt.subplots(1, 1, figsize=(16, 8))
Array basics
Linear algebra

Data formats and Step 2


handling
Pandas package Plot data using different plot types
Series
DataFrame
An overview of plot types can be found in the examples gallery.
Import/Export data

Visual
Best practice Step 2
illustrations
Matplotlib package
x = np.arange(0, 10, 0.1)
Figures and subplots y = np.sin(x)
Plot types and styles
ax.scatter(x, y)
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Best practice: Visual illustrations 268
Essential
concepts
Getting started
Procedural
programming
Object-orientation
1.00
Numerical
programming
0.75
NumPy package
Array basics 0.50
Linear algebra
0.25
Data formats and
handling
0.00
Pandas package
Series 0.25
DataFrame
Import/Export data 0.50

Visual 0.75
illustrations
Matplotlib package 1.00
Figures and subplots 0 2 4 6 8 10
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Best practice: Visual illustrations 269
Essential
concepts
Getting started Step 3
Procedural
programming Set colors, markers and line styles
Object-orientation

Numerical
programming
Best practice Step 3
NumPy package
ax.scatter(x, y, color="green", marker="s")
Array basics
Linear algebra

Data formats and Step 4


handling
Pandas package Set title, axis labels and ticks
Series
DataFrame Best practice Step 4
Import/Export data

Visual ax.set_title("Sine wave", fontsize=30)


illustrations ax.set_xticks([0, 2.5, 5, 7.5, 10])
ax.set_yticks([-1, 0, 1])
Matplotlib package
Figures and subplots
Plot types and styles ax.set_ylabel("y-value", fontsize=20)
Pandas layers ax.set_xlabel("x-value", fontsize=20)
Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Best practice: Visual illustrations 270
Essential
concepts
Getting started
Procedural

Sine wave
programming
Object-orientation
1
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


y-value

handling
0
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package 1
Figures and subplots 0.0 2.5 5.0 7.5 10.0
Plot types and styles x-value
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Best practice: Visual illustrations 271
Essential
concepts
Getting started Step 5
Procedural
programming Set labels
Object-orientation

Numerical Best practice Step 5


programming
NumPy package ax.scatter(x, y, color="green", marker="s", label="Sine")
Array basics
Linear algebra

Data formats and Step 6


handling
Pandas package
Set legend (if you add another plot to an existing figure)
Series
DataFrame Best practice Step 6
Import/Export data

Visual ax.plot(np.arange(11) / 10, color="blue", linestyle="-",


illustrations label="Linear")
Matplotlib package
Figures and subplots
ax.legend(fontsize=20)
Plot types and styles
Pandas layers
Step 7
Applications
Time series
Save plot to file
Moving window
Financial applications
Optimization
Best practice Step 7
fig.savefig("out/sinewave.pdf")
© 2020 PyEcon.org
Best practice: Visual illustrations 272
Essential
concepts
Getting started
Procedural

Sine wave
programming
Object-orientation
1
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


y-value

handling
0
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations Linear
Matplotlib package 1 Sine
Figures and subplots 0.0 2.5 5.0 7.5 10.0
Plot types and styles x-value
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 4.4 273
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Visual illustrations
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Pandas layers
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plotting with layers 274
Essential
concepts
Getting started Plotting with matplotlib is often tedious and requires some research:
Procedural
programming You need to recall parameter details to create a professional charts. For
Object-orientation
recurring, everyday tasks, you might prefer another level of abstraction:
Numerical
programming Layer frameworks, which operate on top of matplotlib, produce pretty
NumPy package
Array basics
looking results with short methods and less code. The most popular
Linear algebra packages are:
Data formats and
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots pandas provides a convenient layer with frequently demanded
Plot types and styles
Pandas layers plotting methods for its objects, such as Series and DataFrames.
Applications
Time series
Seaborn is a powerful graphics framework that allows you to easily
Moving window create beautiful, complex graphics using a simple interface.
Financial applications
Optimization
→ In this section, we will have a look at pandas’ integrated layer
methods. However, Seaborn also works very well with pandas objects.
© 2020 PyEcon.org
Line plots 275
Essential
concepts
Getting started DataFrame/Series.plot(): Plots a DataFrame or a Series.
Procedural
programming
Object-orientation Simple line plot
Numerical
programming plt.close("all")
NumPy package p = pd.Series(np.random.rand(10).cumsum(),
Array basics
index=np.arange(0, 1000, 100))
p
Linear algebra

Data formats and


handling
Pandas package
## 0 0.669761
Series ## 100 0.989702
DataFrame
## 200 1.655715
## 300 1.966073
Import/Export data

Visual
illustrations
## 400 2.151883
Matplotlib package ## 500 2.776987
Figures and subplots ## 600 2.839751
Plot types and styles
Pandas layers
## 700 3.188431
## 800 4.169061
Applications
Time series
## 900 4.923286
Moving window ## dtype: float64
Financial applications
Optimization p.plot()
plt.savefig("out/line.pdf")
© 2020 PyEcon.org
Line plots 276
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical 5
programming
NumPy package
Array basics
Linear algebra 4
Data formats and
handling
Pandas package
3
Series
DataFrame
Import/Export data

Visual 2
illustrations
Matplotlib package
Figures and subplots
Plot types and styles 1
Pandas layers

Applications 0 200 400 600 800


Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Line plots 277
Essential
concepts
Getting started
Procedural Line plots
programming
Object-orientation df = pd.DataFrame(np.random.randn(10, 3), index=np.arange(10),
Numerical columns=["a", "b", "c"])
programming
df
NumPy package
Array basics
Linear algebra ## a b c
Data formats and
## 0 1.703615 -1.376905 -1.336154
handling ## 1 -1.402924 0.812501 1.739143
Pandas package
## 2 0.593504 0.699582 0.423217
## 3 1.140647 -1.454363 0.250578
Series
DataFrame
Import/Export data ## 4 -0.044809 0.438279 -0.821514
Visual ## 5 1.897959 -0.254581 0.157704
illustrations ## 6 0.782639 1.196116 0.763081
Matplotlib package
Figures and subplots
## 7 0.577947 1.815039 1.175842
Plot types and styles ## 8 -0.278585 -0.538956 0.102930
Pandas layers ## 9 -0.091891 0.310788 -0.857167
Applications
Time series df.plot(figsize=(15, 12))
Moving window
plt.savefig("out/line2.pdf")
Financial applications
Optimization

© 2020 PyEcon.org
Line plots 278
Essential
concepts
Getting started
Procedural
2.0 a
programming b
Object-orientation c

Numerical
1.5
programming
NumPy package
Array basics
Linear algebra 1.0

Data formats and


handling
Pandas package 0.5
Series
DataFrame
Import/Export data 0.0
Visual
illustrations
Matplotlib package
0.5
Figures and subplots
Plot types and styles
Pandas layers
1.0
Applications
Time series
Moving window
1.5
Financial applications
0 2 4 6 8
Optimization

© 2020 PyEcon.org
Plotting and pandas 279
Essential
concepts
Getting started The plot method applied to a DataFrame plots each column as a
Procedural
programming different line and shows the legend automatically. Plotting DataFrames,
Object-orientation
there are serveral arguments to change the style of the plot:
Numerical
programming
NumPy package
Array basics Argument Description
Linear algebra
kind "line", "bar", etc
Data formats and
handling logy logarithmic scale on Y-axis
Pandas package
Series
use_index If True, use index for tick labels
DataFrame
rot Rotation of tick labels
Import/Export data

Visual
xticks Values for x ticks
illustrations
Matplotlib package
yticks Values for y ticks
Figures and subplots grid Set grid True or False
Plot types and styles
Pandas layers xlim X-axis limits
Applications ylim Y-axis limits
Time series
Moving window
subplots Plot each DataFrame column in a new subplot
Financial applications
Optimization Table: Pandas plot arguments

© 2020 PyEcon.org
Pandas plot 280
Essential
concepts
Getting started
Procedural
Separated line plots
programming
Object-orientation df.plot(grid=True, rot=45, subplots=True, title="Example",
Numerical figsize=(15, 10))
programming plt.savefig("out/pandas.pdf")
NumPy package
Array basics
Linear algebra
Example
Data formats and
handling
2
a
Pandas package
1
Series
0
DataFrame
Import/Export data 1

Visual b
illustrations 1
Matplotlib package
0
Figures and subplots
1
Plot types and styles
Pandas layers
1.5 c
Applications 1.0
0.5
Time series 0.0
0.5
Moving window
1.0
Financial applications
0

8
Optimization

© 2020 PyEcon.org
Standard creation of plots and pandas 281
Essential
concepts
Getting started dataframe.plot(ax=subplot): Plots a dataframe into subplot.
Procedural
programming
Object-orientation Standard creation
Numerical
programming
fig = plt.figure(figsize=(6, 6))
NumPy package ax = fig.add_subplot(1, 1, 1)
Array basics guests = np.array([[1334, 456], [1243, 597], [1477, 505],
Linear algebra
[1502, 404], [854, 512], [682, 0]])
Data formats and canteen = pd.DataFrame(guests,
handling
Pandas package
index=["Mon", "Tue", "Wed",
Series "Thu", "Fri", "Sat"],
DataFrame columns=["Zentral", "Turm"])
Import/Export data
canteen
Visual
illustrations
Matplotlib package
## Zentral Turm
Figures and subplots ## Mon 1334 456
Plot types and styles ## Tue 1243 597
Pandas layers
## Wed 1477 505
Applications
## Thu 1502 404
## Fri 854 512
Time series
Moving window
Financial applications ## Sat 682 0
Optimization

© 2020 PyEcon.org
Standard creation of plots and pandas 282
Essential
concepts
Getting started
Procedural Bar plot
programming
Object-orientation canteen.plot(ax=ax, kind="bar")
Numerical ax.set_ylabel("guests", fontsize=20)
programming
ax.set_title("Canteen use in Göttingen", fontsize=20)
NumPy package
Array basics
fig.savefig("out/canteen.pdf")
Linear algebra

Data formats and

The bar plot resides in the subplot ax,


handling
Pandas package
Series
DataFrame The label and title are set as shown before without using pandas.
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Bar plot 283
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Canteen use in Göttingen
Numerical Zentral
programming Turm
NumPy package
1400
Array basics
Linear algebra 1200
Data formats and
handling 1000
guests
Pandas package
Series
800
DataFrame
Import/Export data
600
Visual
illustrations
Matplotlib package
400
Figures and subplots
Plot types and styles 200
Pandas layers

Applications 0
Tue

Fri
Wed
Mon

Thu

Sat
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Bar plot 284
Essential
concepts
Getting started
Procedural Bar plot - stacked
programming
Object-orientation canteen.plot(ax=ax, kind="bar", stacked=True)
Numerical ax.set_ylabel("guests", fontsize=20)
programming
ax.set_title("Canteen use in Göttingen", fontsize=20)
NumPy package
Array basics
fig.savefig("out/canteenstacked.pdf")
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Bar plot 285
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Canteen use in Göttingen
Numerical 2000 Zentral
programming Turm
NumPy package Zentral
1750 Turm
Array basics
Linear algebra
1500
Data formats and
handling
1250
guests
Pandas package
Series
DataFrame 1000
Import/Export data

Visual
750
illustrations
Matplotlib package 500
Figures and subplots
Plot types and styles 250
Pandas layers

Applications 0
Tue

Fri
Wed
Mon

Thu

Sat
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plot financial data 286
Essential
concepts
Getting started
Procedural BTC chart
programming
Object-orientation fig = plt.figure(figsize=(16, 8))
Numerical ax = fig.add_subplot(1, 1, 1)
programming ax.set_ylabel("price", fontsize=20)
NumPy package
ax.set_xlabel("Date", fontsize=20)
Array basics
Linear algebra
BTC = pd.read_csv("data/btc-eur.csv", index_col=0, parse_dates=True)
Data formats and
BTCclose = BTC["Close"]
handling BTCclose.plot(ax=ax)
Pandas package
ax.set_title("BTC-EUR", fontsize=20)
fig.savefig("out/btc.pdf")
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plot financial data 287
Essential
concepts
Getting started
Procedural
programming
Object-orientation BTC-EUR
Numerical
15000
programming
NumPy package
12500
Array basics
Linear algebra
10000
price

Data formats and


handling 7500
Pandas package
Series 5000

DataFrame
Import/Export data
2500

Visual 0
illustrations
2 3 4 5 6 7 8 9
201 201 201 201 201 201 201 201
Matplotlib package
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plot financial data 288
Essential
concepts
Getting started
Procedural Compare - bad illustration
programming
Object-orientation amazon = pd.read_csv("data/amzn.csv", index_col=0,
Numerical parse_dates=True)["Close"]
programming
siemens = pd.read_csv("data/sie.de.csv", index_col=0,
NumPy package
Array basics
parse_dates=True)["Close"]
Linear algebra fig = plt.figure(figsize=(16, 8))
Data formats and ax = fig.add_subplot(1, 1, 1)
handling ax.set_ylabel("price")
amazon.plot(ax=ax, label="Amazon")
Pandas package
Series
DataFrame siemens.plot(ax=ax, label="Siemens")
Import/Export data ax.legend(loc="best")
Visual fig.savefig("out/compare.pdf")
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers
In this illustration you can hardly compare the trend of the two
Applications stocks,
Time series
Moving window
Using pandas you can standardize both dataframes in one line.
Financial applications
Optimization

© 2020 PyEcon.org
Plot financial data 289
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Amazon
Numerical Siemens
1400
programming
NumPy package 1200
Array basics
Linear algebra 1000

Data formats and


price

800
handling
Pandas package 600
Series
DataFrame 400
Import/Export data
200
Visual
illustrations
3 5 7 9 1 1 3
7-0 7-0 7-0 7-0 7-1 8-0 8-0
Matplotlib package 201 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plot financial data 290
Essential
concepts
Getting started
Procedural Compare - good illustration
programming
Object-orientation amazon = amazon / amazon[0] * 100
Numerical siemens = siemens / siemens[0] * 100
programming
fig = plt.figure(figsize=(16, 8))
NumPy package
Array basics
ax = fig.add_subplot(1, 1, 1)
Linear algebra ax.set_ylabel("percentage")
Data formats and amazon.plot(ax=ax, label="Amazon")
handling siemens.plot(ax=ax, label="Siemens")
ax.legend(loc="best")
Pandas package
Series
DataFrame fig.savefig("out/comparenew.pdf")
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Plot financial data 291
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Amazon
Numerical Siemens
programming
160
NumPy package
Array basics
Linear algebra
140
percentage

Data formats and


handling
Pandas package 120
Series
DataFrame
Import/Export data 100

Visual
illustrations
3 5 7 9 1 1 3
7-0 7-0 7-0 7-0 7-1 8-0 8-0
Matplotlib package 201 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Chapter 5 292
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Applications
Numerical
programming
NumPy package
Array basics
5.1 Time series
Linear algebra

Data formats and


5.2 Moving window
handling
Pandas package 5.3 Financial applications
Series
DataFrame
Import/Export data
5.4 Optimization
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 5.1 293
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Applications
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Time series
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Date and time data types 294
Essential
concepts
Getting started Data types for date and time are included in the Python standard
Procedural
programming library.
Object-orientation

Numerical
programming
Datetime creation
NumPy package from datetime import datetime
Array basics now = datetime.now()
now
Linear algebra

Data formats and


handling
Pandas package
## datetime.datetime(2020, 10, 31, 14, 12, 39, 24172)
Series
DataFrame now.day
Import/Export data

Visual ## 31
illustrations
Matplotlib package
Figures and subplots
now.hour
Plot types and styles
Pandas layers ## 14
Applications
Time series From datetime you can get the attributes year, month, day, hour,
minute, second, microsecond.
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Set datetime 295
Essential
concepts
Getting started datetime(year, month, day, ..., microsecond): Sets date and
Procedural
programming time.
Object-orientation

Numerical Datetime representation


programming
NumPy package holiday = datetime(2020, 12, 24, 8, 30)
Array basics
holiday
Linear algebra

Data formats and


handling
## datetime.datetime(2020, 12, 24, 8, 30)
Pandas package
Series exam = datetime(2020, 12, 9, 10)
DataFrame print("The exam will be on the " + "{:%Y-%m-%d}".format(exam))
Import/Export data

Visual ## The exam will be on the 2020-12-09


illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Time difference 296
Essential
concepts
Getting started timedelta(days, seconds, microseconds): Represents difference
Procedural
programming between two datetime objects.
Object-orientation

Numerical
programming
Datetime difference
NumPy package from datetime import timedelta
Array basics delta = exam - now
delta
Linear algebra

Data formats and


handling
Pandas package
## datetime.timedelta(days=38, seconds=71240, microseconds=975828)
Series
DataFrame print("The exam will take place in " + str(delta.days) + " days.")
Import/Export data

Visual ## The exam will take place in 38 days.


illustrations
Matplotlib package
Figures and subplots
now
Plot types and styles
Pandas layers ## datetime.datetime(2020, 10, 31, 14, 12, 39, 24172)
Applications
Time series now + timedelta(10, 120)
Moving window
Financial applications ## datetime.datetime(2020, 11, 10, 14, 14, 39, 24172)
Optimization

© 2020 PyEcon.org
Convert string and datetime 297
Essential
concepts
Getting started datetime.strftime("format"): Converts datetime object into string.
Procedural
programming datetime.strptime(datestring, "format"): Converts date as a
Object-orientation
string into a datetime object.
Numerical
programming
NumPy package Convert Datetime
Array basics
Linear algebra
stamp = datetime(2020, 4, 12)
Data formats and
stamp
handling
Pandas package ## datetime.datetime(2020, 4, 12, 0, 0)
Series
DataFrame
Import/Export data
print("German date format: " + stamp.strftime("%d.%m.%Y"))
Visual
illustrations
## German date format: 12.04.2020
Matplotlib package
Figures and subplots val = "2020-5-5"
Plot types and styles d = datetime.strptime(val, "%Y-%m-%d")
Pandas layers
d
Applications
Time series
## datetime.datetime(2020, 5, 5, 0, 0)
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Convert string and datetime 298
Essential
concepts
Getting started
Procedural Converting examples
programming
Object-orientation val = "31.01.2012"
Numerical d = datetime.strptime(val, "%d.%m.%Y")
programming
d
NumPy package
Array basics
Linear algebra
## datetime.datetime(2012, 1, 31, 0, 0)
Data formats and
handling now.strftime("Today is %A and we are in week %W of the year %Y.")
Pandas package
Series ## 'Today is Saturday and we are in week 43 of the year 2020.'
DataFrame

now.strftime("%c")
Import/Export data

Visual
illustrations
Matplotlib package
## 'Sat 31 Oct 2020 02:12:39 PM '
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Overview: Datetime formats 299
Essential
concepts
Getting started
Procedural
programming Type Description
Object-orientation
%Y 4-digit year
Numerical
programming %m 2-digit month [01, 12]
NumPy package
Array basics %d 2-digit day [01, 31]
Linear algebra
%H Hour (24-hour clock) [00, 23]
Data formats and
handling %I Hour (12-hour clock) [01, 12]
Pandas package
Series
%M 2-digit minute [00, 59]
DataFrame %S Second [00, 61]
Import/Export data

Visual
%W Week number of the year [00, 53]
illustrations
Matplotlib package
%F Shortcut for %Y-%m-%d
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Overview : Datetime formats 300
Essential
concepts
Getting started
Procedural
programming Type Description
Object-orientation
%a Abbreviated weekday name
Numerical
programming %A Full weekday name
NumPy package
Array basics %b Abbreviated month name
Linear algebra
%B Full month name
Data formats and
handling %c Full date and time
Pandas package
Series
%x Locale-appropriate formatted date
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Generating date ranges with pandas 301
Essential
concepts
Getting started pd.date_range(start, end, freq): Generates a date range.
Procedural
programming
Object-orientation Date ranges
Numerical
programming import pandas as pd
NumPy package index = pd.date_range("2020-01-01", now)
Array basics index[0:2]
Linear algebra
index[15:16]
Data formats and
handling
index = pd.date_range("2020-01-01", now, freq="M")
Pandas package index[0:2]
## DatetimeIndex(['2020-01-01', '2...ype='datetime64[ns]', freq='D')
Series
DataFrame
Import/Export data ## DatetimeIndex(['2020-01-16'], dtype='datetime64[ns]', freq='D')
Visual ## DatetimeIndex(['2020-01-31', '2...ype='datetime64[ns]', freq='M')
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Overview: Time series frequencies 302
Essential
concepts
Getting started
Procedural
programming Alias Offset type
Object-orientation
D Day
Numerical
programming B Business day
NumPy package
Array basics H Hour
Linear algebra
T Minute
Data formats and
handling S Second
Pandas package
Series
M Month end
DataFrame BM Business month end
Import/Export data

Visual
Q-JAN, Q-FEB, ... Quarter end
illustrations
Matplotlib package
A-JAN, A-FEB, ... Year end
Figures and subplots AS-JAN, AS-FEB, ... Year begin
Plot types and styles
Pandas layers
BA-JAN, BA-FEB, ... Business year end
Applications BAS-JAN, BAS-FEB, ... Business year begin
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Resample date ranges 303
Essential
concepts
Getting started DataFrame.resample("frequency"): Resamples time series by a
Procedural
programming specified frequency.
Object-orientation

Numerical
programming
Resample date ranges
NumPy package import numpy as np
Array basics
Linear algebra
start = datetime(2016, 1, 1)
ind = pd.date_range(start, now)
Data formats and
handling numbers = np.arange((now - start).days + 1)
Pandas package df = pd.DataFrame(numbers, index=ind)
Series
DataFrame
Import/Export data

Visual
illustrations

df.head() df.resample("3BM").sum().head()
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers ## 0 ## 0
Applications ## 2016-01-01 0 ## 2016-01-29 406
Time series ## 2016-01-02 1 ## 2016-04-29 6734
Moving window
## 2016-01-03 2 ## 2016-07-29 15015
Financial applications
Optimization
## 2016-01-04 3 ## 2016-10-31 24205
## 2016-01-05 4 ## 2017-01-31 32246

© 2020 PyEcon.org
Section 5.2 304
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Applications
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Moving window
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Moving window functions 305
Essential
concepts
Getting started DataFrame.rolling(window): Conducts rolling window computa-
Procedural
programming tions.
Object-orientation

Numerical Rolling mean


programming
NumPy package import matplotlib.pyplot as plt
Array basics amazon = pd.read_csv("data/amzn.csv", index_col=0,
parse_dates=True)["Adj Close"]
Linear algebra

Data formats and


handling
fig = plt.figure(figsize=(16, 8))
Pandas package ax = fig.add_subplot(1, 1, 1)
Series ax.set_ylabel("price")
DataFrame
amazon.plot(ax=ax, label="Amazon")
Import/Export data
amazon.rolling(window=20).mean().plot(ax=ax, label="Rolling mean")
Visual
illustrations ax.legend(loc="best")
Matplotlib package ax.set_title("Amazon price and rolling mean", fontsize=25)
Figures and subplots
fig.savefig("out/amzn.pdf")
Plot types and styles
Pandas layers

Applications Frequently used rolling functions: mean(), median(), sum(), var(),


Time series
Moving window
std(), min(), max().
Financial applications
Optimization

© 2020 PyEcon.org
Moving window functions 306
Essential
concepts
Getting started
Procedural
programming
Object-orientation
1500
Amazon price and rolling mean
Amazon
Numerical Rolling mean
programming 1400
NumPy package
Array basics 1300
Linear algebra
1200
Data formats and
price

handling
1100
Pandas package
Series
1000
DataFrame
Import/Export data
900
Visual
illustrations
7-03 7-0
5
7-0
7
7-0
9
7-1
1
8-0
1
8-0
3
Matplotlib package 201 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Moving window functions 307
Essential
concepts
Getting started
Procedural Standard deviation
programming
Object-orientation fig = plt.figure(figsize=(16, 8))
Numerical ax = fig.add_subplot(1, 1, 1)
programming pfizer = pd.read_csv("data/pfe.csv", index_col=0,
NumPy package
parse_dates=True)["Adj Close"]
Array basics
Linear algebra
pg = pd.read_csv("data/pg.csv", index_col=0,
Data formats and
parse_dates=True)["Adj Close"]
handling prices = pd.DataFrame(index=amazon.index)
Pandas package
prices["amazon"] = pd.DataFrame(amazon)
prices["pfizer"] = pd.DataFrame(pfizer)
Series
DataFrame
Import/Export data prices["pg"] = pd.DataFrame(pg)
Visual prices_std = prices.rolling(window=20).std()
illustrations prices_std.plot(ax=ax)
Matplotlib package
Figures and subplots
ax.set_title("Standard deviation", fontsize=25)
Plot types and styles fig.savefig("out/std.pdf")
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Moving window functions 308
Essential
concepts
Getting started
Procedural
programming
Object-orientation Standard deviation
amazon
Numerical pfizer
70 pg
programming
NumPy package 60
Array basics
50
Linear algebra

Data formats and 40


handling
30
Pandas package
Series
20
DataFrame
Import/Export data 10

Visual 0
illustrations
5 7 9 1 1 3
7-0 7-0 7-0 7-1 8-0 8-0
Matplotlib package 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Moving window functions 309
Essential
concepts
Getting started
Procedural Logarithmic standard deviation
programming
Object-orientation fig = plt.figure(figsize=(16, 8))
Numerical ax = fig.add_subplot(1, 1, 1)
programming
prices_std.plot(ax=ax, logy=True)
NumPy package
Array basics
ax.set_title("Logarithmic standard deviation", fontsize=25)
Linear algebra fig.savefig("out/std_log.pdf")
Data formats and
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Moving window functions 310
Essential
concepts
Getting started
Procedural
programming
Object-orientation
102
Logarithmic standard deviation
amazon
Numerical pfizer
pg
programming
NumPy package
Array basics
101
Linear algebra

Data formats and


handling
Pandas package
Series 100

DataFrame
Import/Export data

Visual
illustrations
5 7 9 1 1 3
7-0 7-0 7-0 7-1 8-0 8-0
Matplotlib package 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Exponentially weighted functions 311
Essential
concepts
Getting started DataFrame.ewm(span): Computes exponentially weighted rolling win-
Procedural
programming dow functions.
Object-orientation

Numerical Exponentially weighted functions


programming
NumPy package fig = plt.figure(figsize=(16, 8))
Array basics
ax = fig.add_subplot(1, 1, 1)
amazon.rolling(window=40).mean().plot(ax=ax, label="Rolling mean")
Linear algebra

Data formats and


handling amazon.ewm(span=40).mean().plot(ax=ax, label="Exp mean",
Pandas package linestyle="--", color="red")
Series amazon.plot(ax=ax, label="Amazon price")
DataFrame
Import/Export data
ax.legend(loc="best")
ax.set_title("Exponentially weighted functions", fontsize=25)
Visual
illustrations fig.savefig("out/mean.pdf")
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Exponentially weighted functions 312
Essential
concepts
Getting started
Procedural
programming
Object-orientation
1500
Exponentially weighted functions
Rolling mean
Numerical Exp mean
Amazon price
programming 1400
NumPy package
Array basics 1300
Linear algebra
1200
Data formats and
handling
1100
Pandas package
Series
1000
DataFrame
Import/Export data
900
Visual
illustrations
7-03 7-0
5
7-0
7
7-0
9
7-1
1
8-0
1
8-0
3
Matplotlib package 201 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Binary moving window functions 313
Essential
concepts
Getting started DataFrame.pct_change(): Computes the percentage changes per
Procedural
programming period.
Object-orientation

Numerical
programming
Percentage change
NumPy package
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
Array basics
Linear algebra

Data formats and


returns = prices.pct_change()
handling returns.head()
Pandas package
Series
## amazon pfizer pg
## Date
DataFrame
Import/Export data
## 2017-02-23 NaN NaN NaN
Visual
illustrations ## 2017-02-24 -0.008155 0.005872 -0.000878
Matplotlib package ## 2017-02-27 0.004023 0.000584 -0.001757
Figures and subplots
Plot types and styles
## 2017-02-28 -0.004242 -0.004668 0.001980
Pandas layers ## 2017-03-01 0.009514 0.008792 0.006479
Applications
Time series
returns.plot(ax=ax)
Moving window ax.set_title("Returns", fontsize=25)
Financial applications fig.savefig("out/returns.pdf")
Optimization

© 2020 PyEcon.org
Binary moving window functions 314
Essential
concepts
Getting started
Procedural
programming
Object-orientation Returns
amazon
Numerical 0.125 pfizer
pg
programming
NumPy package 0.100

Array basics
0.075
Linear algebra
0.050
Data formats and
handling
0.025
Pandas package
Series 0.000
DataFrame
Import/Export data 0.025

Visual 0.050
illustrations
3 5 7 9 1 1 3
7-0 7-0 7-0 7-0 7-1 8-0 8-0
Matplotlib package 201 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Binary moving window functions 315
Essential
concepts
Getting started DataFrame.rolling().corr(benchmark): Computes correlation be-
Procedural
programming tween two time series.
Object-orientation

Numerical Correlation
programming
NumPy package fig = plt.figure(figsize=(16, 8))
Array basics ax = fig.add_subplot(1, 1, 1)
Linear algebra
DJI = pd.read_csv("data/dji.csv", index_col=0,
Data formats and parse_dates=True)["Adj Close"]
handling
Pandas package
DJI_ret = DJI.pct_change()
Series corr = returns.rolling(window=20).corr(DJI_ret)
DataFrame corr.plot(ax=ax)
ax.grid()
Import/Export data

Visual
illustrations
ax.set_title("20 days correlation", fontsize=25)
Matplotlib package fig.savefig("out/corr.pdf")
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Binary moving window functions 316
Essential
concepts
Getting started
Procedural
programming
Object-orientation 20 days correlation
Numerical 0.8
programming
NumPy package 0.6
Array basics
Linear algebra 0.4

Data formats and


handling 0.2

Pandas package
0.0
Series
DataFrame
0.2
Import/Export data
amazon
Visual pfizer
0.4 pg
illustrations
5 7 9 1 1 3
7-0 7-0 7-0 7-1 8-0 8-0
Matplotlib package 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 5.3 317
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Applications
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Financial applications
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Cumulative returns 318
Essential
concepts
Getting started
Procedural Returns
programming
Object-orientation fig = plt.figure(figsize=(16, 8))
Numerical ax = fig.add_subplot(1, 1, 1)
programming ret_index = (1 + returns).cumprod()
NumPy package
stocks = ["amazon", "pfizer", "pg"]
Array basics
Linear algebra
for i in stocks:
Data formats and
ret_index[i][0] = 1
handling ret_index.tail()
Pandas package
Series
## amazon pfizer pg
DataFrame
Import/Export data
## Date
Visual
## 2018-02-15 1.715298 1.088693 0.932322
illustrations ## 2018-02-16 1.699961 1.105461 0.934471
Matplotlib package ## 2018-02-20 1.723031 1.097840 0.920217
## 2018-02-21 1.740128 1.090218 0.907772
Figures and subplots
Plot types and styles
Pandas layers ## 2018-02-22 1.742968 1.090218 0.914560
Applications
Time series ret_index.plot(ax=ax)
Moving window ax.set_title("Cumulative returns", fontsize=25)
Financial applications
fig.savefig("out/cumret.pdf")
Optimization

© 2020 PyEcon.org
Cumulative returns 319
Essential
concepts
Getting started
Procedural
programming
Object-orientation Cumulative returns
amazon
Numerical pfizer
pg
programming
NumPy package 1.6

Array basics
Linear algebra
1.4
Data formats and
handling
Pandas package 1.2
Series
DataFrame
Import/Export data 1.0

Visual
illustrations
7-03 7-0
5
7-0
7
7-0
9
7-1
1
8-0
1
8-0
3
Matplotlib package 201 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Cumulative returns 320
Essential
concepts
Getting started
Procedural Monthly returns
programming
Object-orientation returns_m = ret_index.resample("BM").last().pct_change()
Numerical returns_m.head()
programming
NumPy package
Array basics
## amazon pfizer pg
Linear algebra ## Date
Data formats and
## 2017-02-28 NaN NaN NaN
handling ## 2017-03-31 0.049110 0.002638 -0.013396
Pandas package
## 2017-04-28 0.043371 -0.008477 -0.020604
## 2017-05-31 0.075276 -0.028124 0.008703
Series
DataFrame
Import/Export data ## 2017-06-30 -0.026764 0.028790 -0.010671
Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Volatility calculation 321
Essential
concepts
Getting started
Procedural Volatility
programming
Object-orientation fig = plt.figure(figsize=(16, 8))
Numerical ax = fig.add_subplot(1, 1, 1)
programming
vola = returns.rolling(window=20).std() * np.sqrt(20)
NumPy package
Array basics
vola.plot(ax=ax)
Linear algebra ax.set_title("Volatility", fontsize=25)
Data formats and fig.savefig("out/vola.pdf")
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Volatility calculation 322
Essential
concepts
Getting started
Procedural
programming
Object-orientation Volatility
0.14 amazon
Numerical pfizer
pg
programming
0.12
NumPy package
Array basics
0.10
Linear algebra

Data formats and 0.08


handling
Pandas package 0.06
Series
DataFrame 0.04
Import/Export data
0.02
Visual
illustrations
5 7 9 1 1 3
7-0 7-0 7-0 7-1 8-0 8-0
Matplotlib package 201 201 201 201 201 201
Date
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Group analysis 323
Essential
concepts
Getting started DataFrame.describe(): Shows a statistical summary.
Procedural
programming
Object-orientation Describe
Numerical
programming
prices.describe()
NumPy package
Array basics ## amazon pfizer pg
Linear algebra ## count 252.000000 251.000000 252.000000
Data formats and ## mean 1044.521903 33.892665 87.934304
handling
## std 158.041844 1.694680 2.728659
Pandas package
Series
## min 843.200012 30.872143 79.919998
DataFrame ## 25% 953.567474 32.593733 86.241475
Import/Export data
## 50% 988.680023 33.147469 87.863598
Visual ## 75% 1136.952484 35.331834 90.363035
illustrations
Matplotlib package
## max 1485.339966 38.661823 92.988976
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Return analysis 324
Essential
concepts
Getting started
Procedural Histogram
programming
Object-orientation fig, ax = plt.subplots(3, 1, figsize=(10, 8), sharex=True)
Numerical for i in range(3):
programming
ax[i].set_title(stocks[i])
NumPy package
Array basics
returns[stocks[i]].hist(ax=ax[i], bins=50)
Linear algebra fig.savefig("out/return_hist.pdf")
Data formats and
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Return analysis 325
Essential
concepts
Getting started
Procedural amazon
programming
40
Object-orientation

Numerical
30
programming
20
NumPy package
Array basics 10
Linear algebra
0
Data formats and pfizer
handling
40
Pandas package
Series 30
DataFrame
Import/Export data 20
Visual 10
illustrations
Matplotlib package 0
pg
Figures and subplots
30
Plot types and styles
Pandas layers
20
Applications
Time series
10
Moving window
Financial applications
Optimization 0
0.050 0.025 0.000 0.025 0.050 0.075 0.100 0.125

© 2020 PyEcon.org
Ordinary Least Squares 326
Essential
concepts
Getting started Using the statsmodels module to determine regressions:
Procedural
programming Series.tolist(): Returns a list containing the DataFrame values.
Object-orientation
sm.OLS(Y, X).fit(): Computes OLS fit of data (X, Y).
Numerical
programming
NumPy package Regression data
Array basics
Linear algebra import statsmodels.api as sm
Data formats and
handling fig = plt.figure(figsize=(16, 8))
Pandas package
Series
ax = fig.add_subplot(1, 1, 1)
DataFrame Y = np.array(amazon.loc["2018-1-1":"2018-1-15"].tolist())
Import/Export data X = np.arange(len(Y))
Visual ax.scatter(x=X, y=Y, marker="o", color="red")
illustrations
fig.savefig("out/reg_data.pdf")
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Ordinary Least Squares 327
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical 1300
programming
NumPy package
Array basics 1280

Linear algebra

1260
Data formats and
handling
Pandas package 1240
Series
DataFrame
1220
Import/Export data

Visual
illustrations 1200

Matplotlib package
Figures and subplots 0 1 2 3 4 5 6 7 8
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Ordinary Least Squares 328
Essential
concepts
Getting started
Procedural Regression
programming
Object-orientation X_reg = sm.add_constant(X)
Numerical res = sm.OLS(Y, X_reg).fit()
programming
b, a = res.params
NumPy package
Array basics
ax.plot(X, a * X + b)
Linear algebra fig.savefig("out/ols.pdf")
Data formats and
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Ordinary Least Squares 329
Essential
concepts
Getting started Summary of OLS regression. To print in python use res.summary().
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Ordinary Least Squares 330
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical 1300
programming
NumPy package
1280
Array basics
Linear algebra
1260
Data formats and
handling
Pandas package 1240
Series
DataFrame
1220
Import/Export data

Visual
1200
illustrations
Matplotlib package
Figures and subplots 0 1 2 3 4 5 6 7 8
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Section 5.4 331
Essential
concepts
Getting started
Procedural
programming
Object-orientation
Applications
Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


I Optimization
handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Newton-Raphson 332
Essential
concepts
Getting started The Newton-Raphson method is an algorithm for finding successively
Procedural
programming better approximations to the roots of real-valued functions.
Object-orientation

Numerical
programming
Let F : Rk → Rk be a continuously differentiable function and JF (xn )
NumPy package the Jacobian matrix of F . The recursive Newton-Raphson method to
Array basics
Linear algebra find the root of F is given by:
Data formats and

x n+1 := x n − J(x n )−1 F (x n )


handling

Pandas package
Series
DataFrame
Import/Export data
with an initial guess x0 .
Visual
illustrations For f : R → R the process is repeated as
Matplotlib package
Figures and subplots
Plot types and styles f (xn )
xn+1 = xn − .
f 0 (xn )
Pandas layers

Applications
Time series
Moving window Accordingly, we can determine the optimum of the function f by
Financial applications
Optimization
applying the method instead to f 0 = df /dx .

© 2020 PyEcon.org
Newton-Raphson 333
Essential
concepts
Getting started As an illustrative application, we consider the function
Procedural
programming
Object-orientation
f (x ) = 3x 3 + 3x 2 − 5x , x ∈ R,
Numerical
programming which is represented by the blue line in the following diagram. The
NumPy package
Array basics
figure depicts the iterative solution path applying the Newton-Raphson
Linear algebra method to find the root, e.g., x solving f (x ) = 0, by tangent points
Data formats and
handling
and tangents starting from the initial guess x0 = −1.
Pandas package
Series
15.0 f(x)
DataFrame
Import/Export data
12.5
Visual
illustrations
10.0
Matplotlib package
Figures and subplots
7.5
Plot types and styles
Pandas layers
5.0
Applications
Time series 2.5
Moving window
Financial applications
0.0
Optimization x0 x3 x2 x1
1.5 1.0 0.5 0.0 0.5 1.0 1.5

© 2020 PyEcon.org
Newton-Raphson implementation 334
Essential
concepts
Getting started The first step involves the definition of the function f (x ) and its
Procedural
programming derivation f 0 (x ) in Python:
Object-orientation

Numerical
programming
Newton-Raphson requirements
NumPy package
def f(x):
Array basics
Linear algebra
return 3 * x**3 + 3 * x**2 - 5 * x
Data formats and
handling
Pandas package
def df(x):
return 9 * x**2 + 6 * x - 5
Series
DataFrame
Import/Export data

Visual Finally, we implement the Newton-Raphson algorithm as outlined above.


illustrations
Matplotlib package We allow for a (small) absolute deviation between the target function
Figures and subplots
Plot types and styles
and its target value, i.e., 0. In addition, for a better understanding,
Pandas layers we plot the solution path using the tangent points for x0 , x1 , . . . , xN .
Applications The solution point is colored black. Hence, the lines starting with
ax.scatter() are not part of the algorithm – they take global variables
Time series
Moving window
Financial applications
Optimization
and are included just for the visual illustration.

© 2020 PyEcon.org
Newton-Raphson implementation 335
Essential
concepts
Getting started
Procedural Newton-Raphson
programming
Object-orientation def newton_raphson(fun, dfun, x0, e):
Numerical delta = abs(fun(x0))
programming
while delta > e:
NumPy package
Array basics
ax.scatter(x0, f(x0), color="red", s=80)
Linear algebra x0 = x0 - fun(x0) / dfun(x0)
Data formats and delta = abs(fun(x0))
handling ax.scatter(x0, f(x0), color="black", s=80)
Pandas package
Series
return(x0)
DataFrame
Import/Export data fig = plt.figure(figsize=(16, 8))
Visual ax = fig.add_subplot(1, 1, 1)
illustrations
x = np.arange(-1.5, 1.7, 0.001)
Matplotlib package
Figures and subplots ax.plot(x, f(x))
Plot types and styles ax.grid()
Pandas layers
x_root = newton_raphson(f, df, -1, 0.1)
Applications fig.savefig("out/newton_raphson_root.pdf")
Time series
print(f"Root at: {x_root:.4f}")
Moving window
Financial applications
Optimization ## Root at: 0.8878

© 2020 PyEcon.org
Newton-Raphson implementation 336
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical 14
programming
NumPy package 12
Array basics
Linear algebra 10

Data formats and 8


handling
Pandas package 6
Series
4
DataFrame
Import/Export data
2
Visual
illustrations 0

Matplotlib package
2
Figures and subplots 1.5 1.0 0.5 0.0 0.5 1.0 1.5
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Newton-Raphson optimization 337
Essential
concepts
Getting started With the definition of the second derivative f 00 , i.e., the derivative of the
Procedural
programming derivative, we can employ the Newton-Raphson method to obtain an
Object-orientation
optimum of the target function f (x ) numerically. Hence, the previous
Numerical
programming example needs only minimal modifications:
NumPy package
Array basics
Linear algebra
Newton-Raphson
Data formats and def ddf(x):
handling
Pandas package
return 18 * x + 6
Series
DataFrame fig = plt.figure(figsize=(16, 8))
Import/Export data
ax = fig.add_subplot(1, 1, 1)
Visual
illustrations
x = np.arange(-1.5, 1.7, 0.001)
Matplotlib package ax.plot(x, f(x))
Figures and subplots ax.grid()
Plot types and styles
x_opt = newton_raphson(df, ddf, 1, 0.1)
fig.savefig("out/newton_raphson_optimum.pdf")
Pandas layers

Applications
Time series
print(f"Minimum at: {x_opt:.4f}")
Moving window
Financial applications ## Minimum at: 0.4886
Optimization

© 2020 PyEcon.org
Newton-Raphson optimization 338
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical 14
programming
NumPy package 12
Array basics
Linear algebra 10

Data formats and 8


handling
Pandas package 6
Series
4
DataFrame
Import/Export data
2
Visual
illustrations 0

Matplotlib package
2
Figures and subplots 1.5 1.0 0.5 0.0 0.5 1.0 1.5
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
Optimization with SciPy 339
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling The scipy.optimize package provides several optimization algorithms.
Pandas package
Series A detailed list of the functions is available here.
DataFrame
Import/Export data The module contains:
Visual
illustrations Unconstrained and constrained minimization of multivariate scalar
Matplotlib package
Figures and subplots
functions using a variety of algorithms.
Plot types and styles
Pandas layers
Global optimization routines.
Applications Least-squares minimization and curve fitting algorithms.
Time series
Moving window Scalar univariate functions minimizers and root finders.
Financial applications
Optimization
Multivariate equation system solvers.

© 2020 PyEcon.org
The minimize function 340
Essential
concepts
Getting started scipy.optimize.minimize(): Provides minimization algorithms for
Procedural
programming multivariate scalar functions.
Object-orientation

Numerical Import minimize


programming
NumPy package from scipy.optimize import minimize
Array basics
Linear algebra

Data formats and


The minimize() functions has several parameters, of which the most
handling
Pandas package
important are:
Series
DataFrame
fun: The objective function to be minimized.
Import/Export data
x0: The initial guess.
Visual
illustrations method: The solver algorithm, such as: BFGS, Nelder-Mead
Matplotlib package
Figures and subplots (Simplex), Newton Conjugate Gradient, and many more.
constraints: The constraints definition.
Plot types and styles
Pandas layers

Applications
Time series
A detailed list of all parameters can be found here.
Moving window
Financial applications
Optimization

© 2020 PyEcon.org
1D optimization 341
Essential
concepts
Getting started Let’s get started by finding the minimum of the simple scalar function
Procedural
programming f (x ) = (x − 4)2 + 3 using minimize():
Object-orientation

Numerical
programming
1D optimization using minimize
NumPy package
def f(x):
Array basics
Linear algebra
return (x - 4)**2 + 3
Data formats and
handling x0 = [1] # the initial guess
Pandas package
result = minimize(f, x0)
result
Series
DataFrame
Import/Export data

Visual
## fun: 3.0000000000000036
illustrations ## hess_inv: array([[0.49999999]])
Matplotlib package ## jac: array([-8.94069672e-08])
## message: 'Optimization terminated successfully.'
Figures and subplots
Plot types and styles
Pandas layers ## nfev: 6
Applications ## nit: 2
Time series ## njev: 3
Moving window ## status: 0
Financial applications
Optimization
## success: True
## x: array([3.99999994])

© 2020 PyEcon.org
1D optimization 342
Essential
concepts
Getting started Let’s check if the minimum is correct by plotting the function and
Procedural
programming marking the minima:
Object-orientation

Numerical 1D optimization using minimize


programming
NumPy package min_y = result.fun # get minimum of the function f
Array basics min_x = result.x # get the x value of the minimum
Linear algebra

Data formats and


handling
fig = plt.figure(figsize=(16, 8))
Pandas package ax = fig.add_subplot(1, 1, 1)
Series x = np.arange(1, 7, 0.001)
DataFrame
ax.plot(x, f(x))
Import/Export data
ax.scatter(min_x, min_y, color="red", s=120)
Visual
illustrations fig.savefig("out/minimize_1D.pdf")
Matplotlib package
Figures and subplots
Plot types and styles 12

Pandas layers
10

Applications
8
Time series
Moving window 6

Financial applications
4
Optimization
1 2 3 4 5 6 7

© 2020 PyEcon.org
2D optimization 343
Essential
concepts
Getting started Now, we consider a simple scalar function of two variables
Procedural
programming f (x1 , x2 ) = (x1 − 1)2 + (x2 − 2.5)2 :
Object-orientation

Numerical 2D optimization using minimize


programming
NumPy package def f(x):
Array basics
Linear algebra
return (x[0] - 1)**2 + (x[1] - 2.5)**2
Data formats and
handling x0 = [0, 0] # the initial guess
Pandas package result = minimize(f, x0)
Series
result
DataFrame
Import/Export data
## fun: 1.968344227868139e-15
Visual
illustrations ## hess_inv: array([[ 0.93103448, -0.1724138 ],
Matplotlib package ## [-0.1724138 , 0.56896552]])
Figures and subplots
## jac: array([-6.95567350e-08, 4.21085256e-08])
## message: 'Optimization terminated successfully.'
Plot types and styles
Pandas layers

Applications
## nfev: 9
Time series
## nit: 2
Moving window ## njev: 3
Financial applications
## status: 0
## success: True
Optimization

## x: array([0.99999996, 2.50000001])
© 2020 PyEcon.org
Comparison of solver algorithms 344
Essential
concepts
Getting started A famous performance test problem for optimization algorithms is the
Procedural
programming Rosenbrock function, which is defined by
Object-orientation

Numerical
programming f (x , y ) = (a − x )2 + b(y − x 2 )2 .
NumPy package

It has a global minimum at (x , y ) = (a, a2 ) and the parameters are


Array basics
Linear algebra

Data formats and usually set to a = 1 and b = 100:


handling
Pandas package
Series
Comparison
DataFrame
Import/Export data
def rosen(x):
Visual
return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
illustrations
Matplotlib package x0 = [1.3, 0.4] # random initial guess
Figures and subplots
Plot types and styles
Pandas layers res_1 = minimize(rosen, x0, method="Nelder-Mead")
Applications res_2 = minimize(rosen, x0, method="Powell")
Time series res_3 = minimize(rosen, x0, method="CG")
Moving window res_4 = minimize(rosen, x0, method="BFGS")
Financial applications
Optimization

© 2020 PyEcon.org
Comparison of solver algorithms 345
Essential
concepts
Getting started
Procedural Comparison results
programming
Object-orientation # The perfect solution would be (1, 1)
Numerical
programming
res_1.x
NumPy package
Array basics
Linear algebra ## array([1.00000287, 1.00000496])
Data formats and
handling res_2.x
Pandas package
Series ## array([1., 1.])
DataFrame

res_3.x
Import/Export data

Visual
illustrations
Matplotlib package
## array([0.99999552, 0.99999104])
Figures and subplots
Plot types and styles res_4.x
Pandas layers

Applications ## array([0.99999554, 0.99999108])


Time series
Moving window
Financial applications There is no "best" solver algorithm, it always depends on the problem.
Optimization

© 2020 PyEcon.org
Constrained optimization 346
Essential
concepts
Getting started One key feature of the minimize() function is minimization with
Procedural
programming constraints.
Object-orientation

Numerical Imagine being a producer of tin cans. Your goal is to create a tin can
programming
NumPy package which has a capacity of 500 ml while consuming as little material as
Array basics
Linear algebra
possible. There are two variables which describe the volume v and the
Data formats and
surface s of a tin can: the radius r and the height h.
handling
Pandas package The volume is given by
Series
DataFrame
Import/Export data v (r , h) = π · r 2 · h,
Visual
illustrations
Matplotlib package and the surface can be computed with
Figures and subplots
Plot types and styles
Pandas layers s(r , h) = 2 · π · r · (r + h).
Applications
Time series
Moving window
Your goal is to minimize the function s(r , h) with the constraint
Financial applications v (r , h) = 500.
Optimization

© 2020 PyEcon.org
Constrained optimization 347
Essential
concepts
Getting started We can easily implement the target and constraining function:
Procedural
programming
Object-orientation Tin can optimization
Numerical
programming def s(x):
NumPy package r = x[0]
Array basics
h = x[1]
return 2 * np.pi * r * (r + h)
Linear algebra

Data formats and


handling
Pandas package
Series def v(x):
DataFrame
Import/Export data
r = x[0]
h = x[1]
Visual
illustrations return np.pi * r**2 * h - 500 # as it is compared to zero
Matplotlib package
Figures and subplots
Plot types and styles The constraint is defined in a special dictionary. The type is either
Pandas layers
inequal ("ineq") or equal ("eq") to zero:
Applications
Time series
Moving window
Constraints
Financial applications
con = {"type": "eq", "fun": v}
Optimization

© 2020 PyEcon.org
Constrained optimization result 348
Essential
concepts
Getting started The last step is to set the initial guess and to choose a solver algorithm.
Procedural
programming Only a few solver algorithms work with constraints, one of which is
Object-orientation
abbreviated by "SLSQP":
Numerical
programming
NumPy package Tin can optimization
Array basics
Linear algebra x0 = [1, 1]
Data formats and result = minimize(s, x0, method="SLSQP", constraints=con)
handling result
Pandas package

## fun: 348.73420544452335
Series
DataFrame
Import/Export data ## jac: array([108.10271072, 27.02567291])
Visual ## message: 'Optimization terminated successfully'
illustrations ## nfev: 29
Matplotlib package
Figures and subplots
## nit: 9
Plot types and styles ## njev: 9
Pandas layers ## status: 0
Applications ## success: True
Time series ## x: array([4.30126983, 8.60254111])
Moving window

x = result.x
Financial applications
Optimization

© 2020 PyEcon.org
Constrained optimization result 349
Essential
concepts
Getting started We have found a combination of radius and height which gives us a
Procedural
programming minimal surface of the tin can:
Object-orientation

Numerical Tin can optimization result


programming
NumPy package r, h = x
Array basics r
Linear algebra

Data formats and ## 4.301269826608837


handling
Pandas package
Series h
DataFrame
Import/Export data ## 8.602541108098178
Visual
illustrations np.pi * r**2 * h
Matplotlib package

## 499.9999999820085
Figures and subplots
Plot types and styles
Pandas layers

Applications
s(x)
Time series
Moving window ## 348.73420544452335
Financial applications
Optimization
A tin can with r = 4.3 cm and h = 8.6 cm has a volume of 500 ml and
a minimal material consumption of 348.7 qcm.
© 2020 PyEcon.org
The End... but not finally 350
Essential
concepts
Getting started
Procedural
programming
Object-orientation

Numerical
programming
NumPy package
Array basics
Linear algebra

Data formats and


handling
Pandas package
Series
DataFrame
Import/Export data

Visual
illustrations
Matplotlib package
Figures and subplots
Plot types and styles
Pandas layers

Applications
Time series
Moving window
Financial applications
Optimization

© 2020 PyEcon.org

You might also like