
Session 9: Python Scripting
Delivered by
Dr. Pratyush Banerjee
Session Objectives
• Basic Understanding of Jupyter Notebook
• Basic Understanding of major Python Libraries
• Basic Understanding of Python Script Widget in Orange
• Some applications of Python Scripting
• Connecting Python Scripting with Orange
• Discussing Pros and Cons of GUI vs. Scripting
Jupyter Notebook
• Jupyter Notebook grew out of Project Jupyter, a non-profit,
open-source project committed to developing open-source software,
open standards, and services for interactive computing.
• Project Jupyter is a spin-off from IPython, an interactive Python
environment created by Fernando Pérez.
• The name Jupyter refers to the three core languages the project was
built around: Julia, Python and R. The name also pays homage to
Galileo's notebooks, in which he recorded his discovery of the moons
of the planet Jupiter.
The Jupyter Notebook Coding Console
Important Python Libraries
• Pandas - Pandas is the most widely used library for loading and working
with data frames in the Python environment. The name is derived from the
term “panel data”, an econometrics term for multidimensional structured
data sets.
• Pandas is an open-source library released under the Berkeley Software
Distribution (BSD) license. It was developed by Wes McKinney and can create
data frames from CSV files, TSV files, and SQL databases.
• The corresponding syntax to import the Pandas library in Jupyter Notebook
is:

    import pandas as pd

• The ‘as pd’ part creates the short alias ‘pd’, so Pandas functions can be
called as ‘pd.command’ instead of writing ‘pandas.command’ every time.
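As a minimal sketch of the import and alias in use, the snippet below builds a data frame from CSV text. The column names and values are made up for illustration, and the in-memory text stands in for a file on disk.

```python
import io

import pandas as pd

# Made-up CSV content for illustration; in practice this would be a file path.
csv_text = "name,score\nAsha,82\nRavi,75\nMeera,91\n"

# pd.read_csv accepts a path or any file-like object.
# For a TSV file, the same call works with sep="\t".
df = pd.read_csv(io.StringIO(csv_text))

# Every Pandas call goes through the short alias pd or the objects it returns.
mean_score = df["score"].mean()

print(df.shape)
print(mean_score)
```

The alias is only a convention, but nearly all Pandas examples use it, so following it makes shared notebooks easier to read.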
Important Python Libraries contd…
• NumPy
NumPy stands for ‘Numerical Python’. This package is essential for
carrying out high-level mathematical computing in Python. NumPy
provides multi-dimensional arrays and matrices, together with the
linear algebra routines that deep learning computations on such
arrays rely on.
• The corresponding syntax to import the NumPy library in Jupyter
Notebook is:
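By convention, NumPy is imported under the alias np. The alias is an assumption here, since it is the community convention rather than something stated above. The sketch below uses made-up values to show a two-dimensional array and two basic linear algebra operations.

```python
import numpy as np

# A 2x2 matrix and a length-2 vector (illustrative values).
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([5.0, 6.0])

# Matrix-vector product: [1*5 + 2*6, 3*5 + 4*6]
product = a @ b

# Solve the linear system a @ x = b for x.
x = np.linalg.solve(a, b)

print(a.shape)
print(product)
print(x)
```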
Machine Learning Library in Python
• Scikit-learn - Scikit-learn is a widely used machine learning library for Python. This package was
started by David Cournapeau as a Google Summer of Code project. What began as a summer
project ended up being one of the most comprehensive repositories of essential ML algorithms,
ranging from classification and regression to clustering-type data mining problems. Scikit-learn,
imported in code as sklearn, features several well-established ML algorithms such as Support Vector
Machines, Random Forests, Neural Networks, Gradient Boosting, K-Means clustering and DBSCAN
(Density-Based Spatial Clustering of Applications with Noise).
Some of the most common applications of Scikit-learn are listed below:
• Regression modeling using ridge and lasso techniques
• Unsupervised classification and cluster analysis
• Decision tree analysis
• Neural network-based analysis with regression and classification algorithms
• Decision boundary learning with SVMs
• Feature analysis and selection (feature engineering)
• Dimensionality reduction through Principal Component Analysis
• Outlier detection and rejection, often checked visually with scatterplots
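To make the first application in the list concrete, here is a minimal sketch of ridge and lasso regression in Scikit-learn. The data is synthetic and generated only for illustration, and the alpha values are arbitrary choices rather than recommendations.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import train_test_split

# Synthetic data: 5 features, but only the first two influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Ridge shrinks all coefficients; lasso can push irrelevant ones to zero.
ridge = Ridge(alpha=1.0).fit(X_train, y_train)
lasso = Lasso(alpha=0.1).fit(X_train, y_train)

# score() returns R^2 on the held-out data.
ridge_r2 = ridge.score(X_test, y_test)
lasso_r2 = lasso.score(X_test, y_test)

print("Ridge coefficients:", ridge.coef_.round(2))
print("Lasso coefficients:", lasso.coef_.round(2))
print("R^2 (ridge, lasso):", round(ridge_r2, 3), round(lasso_r2, 3))
```

The same fit / score pattern applies to most other estimators in the library, which is a large part of why switching between algorithms is easy.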
Python Script Widget in Orange
• The Python Script widget can be used to run a Python script on the
widget's inputs when suitable functionality is not implemented in an
existing widget.
• After the script is executed, variables from the script’s local
namespace are extracted and used as outputs of the widget. The
widget can then be connected to other widgets to visualize the
output.
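A minimal sketch of a script body for the widget follows. It assumes the widget's documented variable names: in_data for the incoming data table and out_data for the table passed to the next widget. The helper first_rows is hypothetical, and the try/except guard exists only so the snippet also runs outside Orange, where in_data is not defined.

```python
# Inside Orange, the widget places in_data in the script's namespace.
# Outside Orange the name does not exist, so fall back to None.
try:
    in_data
except NameError:
    in_data = None


def first_rows(data, n=10):
    """Return the first n rows of a sliceable table, or None if no data arrived."""
    if data is None:
        return None
    return data[:n]


# Whatever is assigned to out_data becomes the widget's data output.
out_data = first_rows(in_data)
```

Connecting a Data Table widget to the output is a quick way to check what the script produced.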
Python Script Widget
Connecting the Python Script Widget in Orange
Python Script for conducting Regression in Orange
Python Script for conducting Classification in Orange
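The following is a hedged sketch of what a regression or classification script for the widget might look like; it is not a reproduction of the scripts on these slides. It assumes Orange's LinearRegressionLearner and LogisticRegressionLearner classes and the widget's in_data, out_learner and out_classifier variables. The helper train is hypothetical, and the guards let the snippet run even when Orange or the input data is absent.

```python
# Import Orange's learners if Orange is available in this environment.
try:
    from Orange.classification import LogisticRegressionLearner
    from Orange.regression import LinearRegressionLearner
except ImportError:
    LogisticRegressionLearner = LinearRegressionLearner = None

# in_data is supplied by the widget; outside Orange it is undefined.
try:
    in_data
except NameError:
    in_data = None


def train(data):
    """Pick a learner from the target type and fit it; returns (learner, model)."""
    if data is None or LinearRegressionLearner is None:
        return None, None
    target = data.domain.class_var
    if target is None:
        # No target variable is set, so there is nothing to predict.
        return None, None
    if target.is_continuous:
        learner = LinearRegressionLearner()    # numeric target: regression
    else:
        learner = LogisticRegressionLearner()  # categorical target: classification
    return learner, learner(data)              # calling a learner on data fits a model


# These names are picked up as the widget's learner and model outputs.
out_learner, out_classifier = train(in_data)
```

The model output can then be connected to a Predictions widget, and the learner output to Test and Score.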
What cannot be done with Orange

• Orange is still an evolving platform


• There are several situations where it will not give you the desired outcome
• Typical situations in statistics –
a) Linear Regression
b) Logistic Regression
c) T Tests and ANOVA
d) Factor Analysis
• Typical situations in machine learning –
a) Recurrent neural networks
b) Long Short-Term Memory (LSTM) networks
Thank You
