Lecture 1: Python for Reproducible Machine Learning
What is Python?
• Python is popular both for data science &
general software development
• Mastering the language fundamentals is critical
• Learn through practice:
• See some examples & learn the rules
• Try out variants of those examples yourself
• Write new code that solves new problems
Where is Python?
• Free to download and install for all platforms
• Command-line:
- $ python
• IDE (integrated development environment)
- Jupyter notebook
- Colab
Python & Jupyter for AI/ML Research
Python
• Python is a free, open-source programming language
• Python is one of the world’s most popular programming languages, with a growing
community
• Python programming skills are in high demand on the job market
• The Python ecosystem includes fast, powerful, and flexible open source tools for
doing data science and AI/ML, such as Pandas, Seaborn, and scikit-learn
Jupyter Notebook and Colab
• Jupyter Notebook is an open-source web application that allows you to create and
share documents that contain code, equations, visualisations and text
• Supports a wide range of workflows in data science and machine learning
• Colab is a free hosted environment that runs Jupyter notebooks on Google Cloud and
requires no installation or setup
User-friendly interactive computational tools
• Prior knowledge of programming is not required
• Coding for ML/AI will be taught from first principles
Review of Python Concepts
• An expression evaluates to a value
• Values can be numbers or strings (text); we’ll see lots
of other kinds of values soon
• The syntax (format) of the language is strict:
even a stray space at the start of a line can cause a syntax error
• Built-in operators have specific behaviour that you need to learn
(e.g., / always produces a float, so 16 / 2 gives 8.0 instead of 8)
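The points above can be tried directly in a notebook cell. The following is a minimal sketch, not taken from the lecture's demo notebook; the names on the left (total, greeting, and so on) are illustrative labels only, and compile() is used here merely to show a syntax error without stopping the cell.

```python
# Expressions evaluate to values; every value has a type.
total = 3 + 4 * 2                # number (int): multiplication happens first
greeting = "data" + " science"   # string (text): + joins strings

# Built-in operators have specific behaviour worth learning.
true_quotient = 16 / 2           # / always gives a float
floor_quotient = 16 // 2         # // keeps whole numbers as ints

print(total, type(total).__name__)
print(greeting, type(greeting).__name__)
print(true_quotient, type(true_quotient).__name__)
print(floor_quotient, type(floor_quotient).__name__)

# The syntax is strict: one stray leading space makes this line invalid.
try:
    compile(" x = 1", "<example>", "exec")
    leading_space_accepted = True
except SyntaxError as err:
    leading_space_accepted = False
    print("Rejected:", type(err).__name__)
```

Changing 16 / 2 to 17 / 2 and 17 // 2 is a quick way to see how the two division operators differ.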
Names
Assignment statement
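As a hedged illustration of names and assignment (the variable names below are made up for this sketch and do not come from the slides): an assignment statement evaluates the expression on the right of = and binds the resulting value to the name on the left.

```python
# The right-hand side is evaluated first, then bound to the name on the left.
hours_per_week = 24 * 7

# A name can be used in later expressions in place of its value.
minutes_per_week = hours_per_week * 60

# Reassigning a name rebinds it; values computed earlier do not change.
hours_per_week = 40

print(hours_per_week)     # 40
print(minutes_per_week)   # 10080
```

Note that minutes_per_week keeps the value computed from the earlier binding: assignment records a value, not a formula.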
Link to Demo Colab notebook
https://colab.research.google.com/drive/1rX_huniaW1fAem4vVTUd--IofKiiyrMR?usp=sharing
Functions
What is a function?
Anatomy of call expression
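The slide's diagram is not reproduced in this text, so the following is only a sketch of the usual anatomy of a call expression, using Python built-ins: a function name, followed by parentheses that enclose comma-separated argument expressions. The result names are illustrative.

```python
# function name -> max ; arguments -> 3, 7, 5
largest = max(3, 7, 5)

# Each argument is itself an expression, evaluated before the call:
# 2 - 10 becomes -8, then abs(-8) is computed.
distance = abs(2 - 10)

# Call expressions can be nested: the inner call runs first.
rounded = round(abs(-12.34))

print(largest)    # 7
print(distance)   # 8
print(rounded)    # 12
```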
14
Anatomy of call expression
15
Review of Function Concepts
• Some functions require a particular number of
arguments (e.g., abs must be called on one value)
• Arguments can be named in the call expression:
round(number=12.34)
But the names must match the documentation
• In Jupyter or Colab, type a ? after a function name (e.g., abs?) to see its
documentation; help(abs) works in any Python session
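A short sketch of these rules using the built-ins abs and round. The exact error messages vary between Python versions, so they are printed rather than quoted here; the result names are illustrative.

```python
# abs takes exactly one argument; passing two raises a TypeError.
try:
    abs(1, 2)
    wrong_count_rejected = False
except TypeError as err:
    wrong_count_rejected = True
    print("abs(1, 2) failed:", err)

# Arguments can be named, using the names given in the documentation.
by_keyword = round(number=12.34)
one_decimal = round(number=12.34, ndigits=1)
print(by_keyword, one_decimal)

# A name that is not in the documentation is rejected.
try:
    round(value=12.34)
    wrong_name_rejected = False
except TypeError as err:
    wrong_name_rejected = True
    print("round(value=12.34) failed:", err)

# Outside a notebook, the documentation text is also available as a string
# (help(abs) shows the same information interactively).
print(abs.__doc__)
```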
Summary
• Python: one of the most widely used languages for
data science
• Simple and easy to learn, yet versatile
• Most common tools for practicing Python: Jupyter,
Anaconda, Colab
• Most useful libraries for data science: NumPy, pandas,
Matplotlib, scikit-learn
Link to Demo Colab notebook
https://colab.research.google.com/drive/1ZyGjsRQljDVWCAqLNpQsOfC8rkl41TaM?usp=sharing
Acknowledgements
Thank you