
BUS265 Machine Learning and Digital Technology
Lecture 1: Python for Reproducible Machine Learning

Dr Valentin Danchev
School of Business and Management, Queen Mary University of London

What is Python?

• A scripting language available on every platform
• Easy to learn, easy to use, quite extensible and robust

What is Python?
• Python is popular both for data science &
general software development
• Mastering the language fundamentals is critical
• Learn through practice:
• See some examples & learn the rules
• Try out variants of those examples yourself
• Write new code that solves new problems

Where is Python?
• Free to download and install for all platforms
• Command-line:
- $ python
• IDE (integrated development environment)
- Jupyter notebook
- Colab
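A quick way to confirm that Python is available, wherever it is run, is to ask the interpreter for its version. The snippet below is a minimal sketch that should behave the same at the command line, in a Jupyter notebook, or in Colab; the name `version` is only illustrative.

```python
import sys

# Build a short "major.minor" string describing the running interpreter.
version = f"{sys.version_info.major}.{sys.version_info.minor}"
print("Running Python", version)
```

In a notebook, paste this into a cell and run it; at the command line, type it after starting `python`.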

Python & Jupyter for AI/ML Research
Python
• Python is a free, open-source programming language
• Python is one of the world's most popular programming languages, with a growing
community
• Python programming skills are in high demand on the job market
• The Python ecosystem includes fast, powerful, and flexible open source tools for
doing data science and AI/ML, such as Pandas, Seaborn, and scikit-learn
Jupyter Notebook and Colab
• Jupyter Notebook is an open-source web application that allows you to create and
share documents that contain code, equations, visualisations and text
• Supports a wide range of workflows in data science and machine learning
• Colab is a free environment that runs Jupyter notebooks on Google Cloud and
requires no installation or setup

User-friendly interactive computational tools

• Prior knowledge of programming is not required
• Coding for ML/AI will be taught from first principles

Review of Python Concepts
• An expression evaluates to a value
• Values can be numbers or strings (text); we’ll see lots
of other kinds of values soon
• The syntax (format) of the language is very rigid —
even an extra space can cause a syntax error
• Built-in operators have particular behaviour that you need to learn
(e.g., dividing with / always produces a float, so the result is 8.0 instead of 8)
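The points above can be tried out directly. The following is a small sketch with illustrative numbers and names (none of them come from the lecture demo); the results are stored in names only so that they can be printed together.

```python
# Each right-hand side is an expression that Python evaluates to a value.
total = 3 + 4 * 2                    # multiplication is applied before addition
quotient = 16 / 2                    # "/" always produces a float
whole_quotient = 16 // 2             # "//" on two integers produces an integer
greeting = "data" + " " + "science"  # "+" joins strings

print(total, quotient, whole_quotient, greeting)
print(type(quotient).__name__, type(whole_quotient).__name__)
```

Here `16 / 2` gives `8.0` while `16 // 2` gives `8`, which is the kind of operator behaviour that has to be learned rather than guessed.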

Names

Assignment statement

• An assignment statement assigns the value of an expression to a simple variable
(e.g., hours_per_week)
• An assignment statement changes the meaning of the name to the left of the =
symbol
• The name is bound to the value of the expression to the right of the = symbol (its
current value; not the equation)
(Demo)
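As a rough illustration of "its current value; not the equation", the sketch below binds `hours_per_week` to a value and then changes one of the names it was computed from. The name `hours_per_day` and the numbers are assumptions made for this example.

```python
hours_per_day = 24
hours_per_week = hours_per_day * 7  # bound to the value 168, not to the formula

hours_per_day = 12                  # rebinding changes what this name means...
print(hours_per_week)               # ...but hours_per_week still refers to 168
```

To update `hours_per_week`, the assignment statement has to be run again after `hours_per_day` changes.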

Link to Demo Colab notebook

https://colab.research.google.com/drive/1rX_huniaW1fAem4vVTUd--IofKiiyrMR?usp=sharing

Functions

What is a function?

• A function is a block of code that:
• takes input parameters
• performs a specific task
• returns an output
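The three parts listed above can be seen in a small user-defined function. This is only a sketch: `weekly_hours` and its parameters are hypothetical names chosen for illustration, not part of the lecture demo.

```python
def weekly_hours(hours_per_day, days=7):
    """Return the total hours in `days` days of `hours_per_day` hours each."""
    # Input parameters: hours_per_day and days (days defaults to 7).
    # Specific task: multiply the two values.
    # Output: the product, handed back with `return`.
    return hours_per_day * days


full_week = weekly_hours(24)    # uses the default of 7 days
working_week = weekly_hours(8, 5)
print(full_week, working_week)
```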

Anatomy of call expression


Review of Function Concepts
• Some functions require a particular number of
arguments (e.g., abs must be called on one value)
• Arguments can be named in the call expression, e.g. round(number=12.34),
but the names must match the documentation
• In Jupyter or Colab, type a ? after a function name (e.g., round?) to see its
documentation
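The behaviour described above can be checked with the built-in functions `abs` and `round`. The sketch below uses illustrative values; the misspelt keyword `value` is deliberately wrong to show what happens when a name does not match the documentation.

```python
magnitude = abs(-5)                           # abs takes exactly one argument
rounded = round(number=12.34)                 # argument passed by name
rounded_one = round(number=12.34, ndigits=1)  # ndigits is also a documented name

# Calling abs with two arguments is an error.
try:
    abs(-5, 3)
    wrong_count_failed = False
except TypeError:
    wrong_count_failed = True

# A keyword that is not in the documentation is also an error.
try:
    round(value=12.34)
    wrong_name_failed = False
except TypeError:
    wrong_name_failed = True

print(magnitude, rounded, rounded_one, wrong_count_failed, wrong_name_failed)
print(abs.__doc__)  # the same text that help(abs) displays
```

Outside a notebook, `help(round)` is the standard way to read the same documentation that `round?` shows in Jupyter or Colab.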

Summary
• Python: one of the most widely used languages for doing data science
• Simple and easy to learn, yet versatile
• Most common tools for practising Python: Jupyter, Anaconda, Colab
• Most useful libraries for data science: NumPy, pandas, Matplotlib,
scikit-learn (imported as sklearn)

Link to Demo Colab notebook

https://colab.research.google.com/drive/1ZyGjsRQljDVWCAqLNpQsOfC8rkl41TaM?usp=sharing

Acknowledgements

• Courses/slides by Foster Provost, Panos Adamopoulos, Karolis Urbonas, Leonid
Zhukov, Mladen Kolar, John Kelleher, Chirag Shah, Data8 at Berkeley

Thank you
