
AutoML

Learn how to use AutoML & AutoDL in practice!
Who am I?
• Stijn Van Hijfte
• Lecturer AI & Coordinator Postgraduate Intelligent Automation at Howest Applied University College
• Consultant at Sia Partners Brussels
• Author of several books on topics such as AI, blockchain, innovation, and more
What are we going to do today?
• Why an automated approach?
• What are the advantages of using an automated approach?
• Introduce some AutoML, AutoDL & AutoForecast libraries
• Where can we use these libraries?
• What aspects can be automated?
• When should we use these libraries?
• Practical examples & tricks
What brought you here today?

Briefly tell me what you expect to see today
What is your professional background?
Why Python?

RICH COMMUNITY
SIMPLE SYNTAX
USED ACROSS A WIDE RANGE OF SOLUTIONS
ENCOURAGES AUTOMATION
VERY STRONG IN DATA SCIENCE
SECURE
…
Machine learning
Deep learning
Forecasting & time series
Traditional Machine Learning
• Traditional machine learning is:
• Time-consuming
• Resource-intensive
• Expensive
• Data-dependent
• Requires expertise
• Often stuck in the test phase
• But also requires humans to:
• Preprocess and clean the data.
• Select and construct appropriate features.
• Select an appropriate model family.
• Optimize model hyperparameters.
• Design the topology of neural networks (if deep learning is used).
• Postprocess machine learning models.
• Critically analyse the results obtained.
What is AutoML?
Allows a wider range of professionals to tackle data-intensive tasks

Faster demo of possible results

Optimization of resources

Faster insight into the available data

Helps you to look in the right direction

Insight into the models chosen

Comparing different models

Less expertise needed for feature selection


What aspects can be automated?
• Data pre-processing
• Data cleaning
• Model training
• Model evaluation
• Feature selection
• Model selection
• …
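
To make the model training, evaluation and selection steps concrete, here is a minimal sketch in plain Python. It is not the API of any AutoML library: the synthetic data, the three candidate models and the helper names (`make_data`, `auto_select`, `CANDIDATES`) are all assumptions made for illustration. The idea is simply to fit every candidate, score it on held-out data, and keep the best.

```python
import random
from statistics import fmean


def make_data(n=200, seed=0):
    """Synthetic one-feature regression data: y = 3x + 2 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3 * x + 2 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


def fit_mean(xs, ys):
    """Baseline: always predict the mean of the training targets."""
    mean_y = fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Ordinary least squares with a single feature."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_knn(xs, ys, k=5):
    """k-nearest-neighbours regression: average the k closest targets."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return fmean([y for _, y in nearest])

    return predict


# Hypothetical candidate pool; a real AutoML tool searches a far larger one.
CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "knn": fit_knn}


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given data."""
    return fmean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


def auto_select(xs, ys, candidates=CANDIDATES, holdout=0.25):
    """Fit each candidate on the training part and rank by validation MSE."""
    split = int(len(xs) * (1 - holdout))
    x_train, y_train = xs[:split], ys[:split]
    x_valid, y_valid = xs[split:], ys[split:]
    scores = {
        name: mse(fit(x_train, y_train), x_valid, y_valid)
        for name, fit in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


xs, ys = make_data()
best, scores = auto_select(xs, ys)
print("selected:", best)
print("validation MSE per model:", scores)
```

A single holdout split keeps the sketch short; cross-validation would give a more reliable ranking at a higher compute cost.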
What use cases
do you see in your
own professional
life?
What different
use cases have
you worked on
in the past?
When might you want to use AutoML?
• You want quick results
• Time sensitive
• Few resources
• Little expertise
• Quick demo
• Test the possibilities
• Point you in the right direction
• Test theory
• Convince decision makers
• Supplement human decisions
• …
What packages are there?
• Pandas Profiling
• Quick exploratory data analysis
• Useful way to start your AutoML journey
• Easy to read and share
• However, cannot completely replace the analysis an experienced data scientist would do
• Sweetviz, Autoviz & D-Tale can also be used
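
To give a feel for what a quick exploratory report contains, the sketch below computes a few per-column summaries in plain Python. It is not the Pandas Profiling API: the `profile` function and the sample rows are hypothetical, and a real profiling tool adds much more (distributions, correlations, warnings, an HTML report).

```python
from statistics import fmean


def profile(rows):
    """Summarise each column of a list of dict records.

    Reports row count, missing values (None) and distinct values, plus
    min / max / mean when every present value is numeric.
    """
    # Collect column names in first-seen order.
    columns = {}
    for row in rows:
        for key in row:
            columns.setdefault(key, None)

    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        summary = {
            "count": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        if present and len(numeric) == len(present):
            summary["min"] = min(numeric)
            summary["max"] = max(numeric)
            summary["mean"] = fmean(numeric)
        report[col] = summary
    return report


# Hypothetical sample data, for illustration only.
sample_rows = [
    {"age": 30, "city": "Ghent"},
    {"age": None, "city": "Brussels"},
    {"age": 50, "city": "Ghent"},
]

for column, summary in profile(sample_rows).items():
    print(column, summary)
```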
What packages are there?
• Snorkel
• Classification tasks
• Good for text classification projects
• Incomplete data or lack of target labels
• Automatically labels data
• Data augmentation
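
The idea of automatically labelling data can be sketched with a few hand-written rules ("labeling functions") whose votes are combined. The code below is plain Python and does not use Snorkel's API: the rules, the label constants and the `weak_label` helper are assumptions for illustration, and the votes are combined with a simple majority vote, a deliberately simpler stand-in for the learned label model that Snorkel provides.

```python
from collections import Counter

# Label constants (hypothetical, chosen for this sketch).
ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_free(text):
    """Vote SPAM when the message mentions 'free'."""
    return SPAM if "free" in text.lower() else ABSTAIN


def lf_many_exclamations(text):
    """Vote SPAM when the message has two or more exclamation marks."""
    return SPAM if text.count("!") >= 2 else ABSTAIN


def lf_greeting(text):
    """Vote HAM when the message opens with a plain greeting."""
    return HAM if text.lower().startswith(("hi ", "hi,", "hello")) else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_free, lf_many_exclamations, lf_greeting]


def weak_label(text, labeling_functions=LABELING_FUNCTIONS):
    """Combine the non-abstaining votes by majority; abstain on ties."""
    votes = [lf(text) for lf in labeling_functions]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting rules with equal support
    return ranked[0][0]


messages = [
    "Win a FREE phone now!!",
    "Hello, see you at the meeting tomorrow",
    "Quarterly report attached",
]
for message in messages:
    print(weak_label(message), message)
```

Messages on which every rule abstains stay unlabelled; in practice they are either left out of training or covered by writing additional rules.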
What packages are there?
• H2O.ai
• Offers a suite of tools that helps with the entire cycle from data cleaning, to model evaluation, to deployment
• Python and R
• GUI available
What packages are there?
• Auto-Sklearn
• Leverages advances in Bayesian optimization, meta-learning and ensemble construction
• Offers regressors, classifiers & preprocessors
• Works on binary classification, multiclass classification, multilabel classification, regression and multioutput regression
• Easy to use
• Allows you to inspect the results
• Linux!
What packages are there?
• TPOT
• Tree-based pipeline
optimization tool
• Automates feature
selection, preprocessing
and construction
• Exports your model as a
Python code file
What packages are there?
• HyperOpt
• Bayesian optimization
• Developed to help optimize machine learning pipelines
• Difficult to use directly
• Easier to use HyperOpt-sklearn
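
The loop behind hyperparameter optimization can be sketched in a few lines. The example below uses plain random search, a simpler stand-in for the Bayesian optimization that HyperOpt performs, and it does not use HyperOpt's API. The `validation_loss` objective is a hypothetical placeholder for "train a model with these settings and return its validation loss", and the search space is likewise an assumption.

```python
import math
import random


def validation_loss(params):
    """Hypothetical objective standing in for a real train-and-validate run.

    By construction the minimum (loss 0) is at learning_rate=0.1, max_depth=6.
    """
    lr_term = (math.log10(params["learning_rate"]) + 1) ** 2
    depth_term = 0.05 * (params["max_depth"] - 6) ** 2
    return lr_term + depth_term


# Each hyperparameter maps to a function that draws one random value.
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, 0),  # log-uniform
    "max_depth": lambda rng: rng.randint(2, 12),            # inclusive bounds
}


def random_search(objective, space, n_trials=50, seed=0):
    """Sample configurations at random and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    history = []
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in space.items()}
        loss = objective(params)
        history.append(loss)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss, history


best_params, best_loss, history = random_search(validation_loss, SEARCH_SPACE)
print("best configuration:", best_params)
print("best loss:", best_loss)
```

A Bayesian optimizer keeps the same loop but replaces the random draw with a choice guided by the losses seen so far, which usually needs fewer trials when each evaluation is expensive.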
What packages are there?
• SMAC
• Sequential model-based algorithm configuration
• Searches the configurations of multiple models
• Evaluating model performance through standard evaluation metrics
What packages are there?
• MLBox
• Fast reading and distributed data preprocessing / cleaning / formatting
• Highly robust feature selection and leak detection
• Accurate hyper-parameter optimization in a high-dimensional space
What packages are there?
• AutoKeras
• Neural architecture search algorithm
• Complex elements such as embeddings and spatial reductions can be added
• It includes preprocessing, vectorizing and cleaning of text data
• Easy to use
What packages are there?
• Ludwig
• Works with audio input, images, text, numerical, binary data, timeseries, …
• Runs on top of TensorFlow
• AutoGluon
• Classify data, format vectors, define number of layers, define model architecture, hyperparameter optimization, …
• AutoGL
• Feature engineering, model training, hyperparameter optimization, model ensemble, …
Q&A
Examples & explanation

Example_AutoML_Explanations
Exercise_1
Solution 1
Q&A
Examples & explanation

Example_AutoML_2
Exercise 2
Solution 2
Q&A
What is automated forecasting?
• Timeseries analysis is often a difficult task
• Requires a lot of very specific expertise
• Single vs multilevel timeseries
• Trend, seasonality and noise
• Many different statistical models to try
• Many packages which can help you to try out different approaches
• Automated forecasting can ease the journey
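
A minimal sketch of what "trying many models" means for forecasting: backtest a handful of simple forecasters on the last part of the series and keep the one with the lowest error. This is plain Python, not the API of any forecasting library. The four candidate methods, the helper names and the toy series are assumptions for illustration.

```python
def naive(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive(period):
    """Repeat the last full season of the given length."""
    def forecast(train, horizon):
        return [train[-period + (h % period)] for h in range(horizon)]
    return forecast


def moving_average(window):
    """Repeat the mean of the last `window` observations."""
    def forecast(train, horizon):
        level = sum(train[-window:]) / window
        return [level] * horizon
    return forecast


def drift(train, horizon):
    """Extend the straight line through the first and last observation."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (h + 1) for h in range(horizon)]


def mae(predicted, actual):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def auto_forecast(series, horizon, candidates):
    """Backtest every candidate on the last `horizon` points, then refit the
    winner on the full series to produce the final forecast."""
    train, test = series[:-horizon], series[-horizon:]
    scores = {name: mae(f(train, horizon), test) for name, f in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores, candidates[best](series, horizon)


# Hypothetical candidate pool and toy series with a period-4 seasonal pattern.
CANDIDATES = {
    "naive": naive,
    "seasonal_naive": seasonal_naive(4),
    "moving_average": moving_average(4),
    "drift": drift,
}
series = [10, 20, 30, 40] * 6

best, scores, forecast = auto_forecast(series, horizon=4, candidates=CANDIDATES)
print("selected:", best)
print("backtest MAE per method:", scores)
print("forecast:", forecast)
```

On this purely seasonal toy series the seasonal-naive method reproduces the held-out points exactly, so it is the one selected. Real series mix trend, seasonality and noise, which is why automated tools evaluate many more model families.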
AutoForecasting libraries

PyAF
Auto-TS
AtsPy
Examples & explanation

AutoML_Examples_3
Exercise 3
Solution 3
Q&A
