Machine Learning

Because learning changes everything.
Chapter 07
Automated Machine Learning
Copyright 2022 © McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill
Education.
What Is Automated Machine Learning (AutoML)?
Recall there are two main types of machine learning methods.

• A supervised model is one with a defined target variable.
• An unsupervised model has no target variable.
Running each supervised technique individually and comparing accuracy
results is too time-consuming.
• An efficient alternative is Automated Machine Learning (AutoML).
• This is a supervised approach that explores and selects models using
different algorithms and compares their predictive performance.
Users still must understand the underlying elements involved in
developing the model.
What Questions Might Arise?
• How was the data collected and prepared for analysis?
• How did the model arrive at a particular conclusion?
• What is the blueprint of the model?
• Why did the model arrive at a particular conclusion?
• What variables had the greatest impact on the predicted outcome?
• What patterns exist in the data?
• How accurate is the model?
• What are the reasons behind why the recommended model produced the most accurate decision?
• Are there data issues that could be impacting the validity of the model?
• Is the model consistent in its predictions?
• Why is the model a good predictor?
AutoML in Marketing
Forty percent of companies report already using machine learning to
improve sales and marketing performance.
The adoption rate for AutoML is expected to increase substantially.
Which Companies Are Actively Using AutoML?
• Facebook.
• AirBnB.
• Sumitomo Mitsui Card Company (SMCC).
• Kroger.
• The Philadelphia 76ers.
• Blue Health Intelligence (BHI).
• United Airlines.
• URBN.
• Disney.
• Pelephone.
• Salesforce Einstein.
What Are Key Steps in the Automated Machine Learning Process?
There are four key steps.
Preparing the data.
Building models.
Creating ensemble models.
Recommending models.
Data Preparation
Data preparation may include handling:

• Missing data.
• Outliers.
• Variable selection.
• Data transformation.
• Data standardization in order to maintain a common format.
Invalid and unreliable data result in “garbage in, garbage out.”
• Appropriate data preparation is a fundamental first step in producing
accurate model predictions.
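The preparation tasks listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `incomes` values are hypothetical, missing values are filled with the median, outliers are clipped at the 1.5 × IQR fences, and "standardization" is interpreted as z-scoring (the slide may equally mean harmonizing formats). An AutoML tool may make different choices for each step.

```python
from statistics import mean, median, pstdev, quantiles

def impute_missing(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, k=1.5):
    """Clip values that fall outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [min(max(v, low), high) for v in values]

def standardize(values):
    """Rescale to mean 0 and standard deviation 1 (z-scores)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical variable with one missing value and one extreme value.
incomes = [42.0, 38.5, None, 51.0, 47.2, 400.0, 44.1]

imputed = impute_missing(incomes)
clipped = clip_outliers(imputed)
prepared = standardize(clipped)
```

The order matters in this sketch: imputing first keeps every record, and clipping before standardizing keeps the extreme value from inflating the standard deviation.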
Model Building
Many models are built automatically after the analyst specifies the
dependent variable.
The purpose of a model is to extract insights from data.
AutoML uses pre-established modeling techniques that create access for
anyone from novices to data science experts.
Creating Ensemble Models
Sometimes the best approach is to combine different algorithms, blending
information from more than one model into a single “super model.”
• This type of model is referred to as an ensemble model.
• This process reduces issues such as noise, bias, and inconsistent or
skewed variance that cause prediction problems.
An ensemble model usually generates the best overall predictive
performance.
• Keep in mind that understanding how different variables have
contributed to an outcome can be difficult.
Simple Approaches to Ensemble Modeling
For continuous target variables, one method is to take the average of
predictions from multiple models.
• You first run each model separately to create a prediction score from
each one (two scores when combining two models).
• Then calculate the average of the two models' scores to create a new
ensemble score.
Another, more advanced technique involves using a weighted average.
• The higher-quality data would be assigned greater importance and thus
weighted higher.
For categorical target variables, the most common category, or
“majority” rule, can be used.
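The three simple approaches above can be sketched as follows. The prediction lists and the weights are hypothetical values chosen for illustration; how weights are actually chosen is not specified in these slides.

```python
from collections import Counter

def average_ensemble(predictions):
    """Average each record's scores across models (continuous target)."""
    return [sum(scores) / len(scores) for scores in zip(*predictions)]

def weighted_ensemble(predictions, weights):
    """Weighted average across models; a larger weight means more influence."""
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, scores)) / total
        for scores in zip(*predictions)
    ]

def majority_vote(predictions):
    """Most common predicted category per record (categorical target)."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]

# Hypothetical scores from two models for three records.
model_a = [10.0, 20.0, 30.0]
model_b = [14.0, 16.0, 34.0]

simple = average_ensemble([model_a, model_b])            # [12.0, 18.0, 32.0]
weighted = weighted_ensemble([model_a, model_b], [3, 1])  # [11.0, 19.0, 31.0]

# Hypothetical class labels from three models for three records.
votes = majority_vote([
    ["default", "paid", "paid"],
    ["default", "default", "paid"],
    ["paid", "default", "paid"],
])  # ["default", "default", "paid"]
```

Using an odd number of models for a two-category target avoids ties in the vote; with ties, this sketch simply returns the label it encountered first.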
Advanced Ensemble Methods – Bagging
Bagging, short for “Bootstrap Aggregating,” involves two main steps.
• Step 1 generates multiple small random samples from the larger
sample.
• Because each observation is copied rather than removed from the
original sample, it can be copied again and placed in a second or third
sample.
• This process is referred to as “bootstrap sampling.”
Step 2 is to execute a model on each sample and then combine the
results.
• For continuous outcomes, the combined result is the average of the
predictions across all samples.
• For categorical outcomes, it is the majority of the case results.
Exhibit 7-3: Bagging (Bootstrap Aggregating)

Source: Amey Naik, “Bagging: Machine Learning Through Visuals. #1: What Is ‘Bagging’ Ensemble Learning?” Medium, June 24, 2018, https://medium.com/machine-learning-through-visuals/machine-learning-through-visuals-part-1-what-is-bagging-ensemble-learning-432059568cc8.
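A minimal sketch of the two bagging steps for a continuous outcome is shown below. The base model (a least-squares line), the sample size, the number of models, and the data are all assumptions made for illustration; any model could stand in for `fit_line`.

```python
import random
from statistics import mean

def bootstrap_sample(rows, size, rng):
    """Step 1: draw `size` rows with replacement, so a row can be copied twice."""
    return rng.choices(rows, k=size)

def fit_line(rows):
    """Least-squares line y = a + b*x; a stand-in for any base model."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    # If every sampled x is identical, fall back to a flat line at the mean.
    slope = sum((x - mx) * (y - my) for x, y in rows) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def bagged_predict(rows, x_new, n_models=25, sample_size=6, seed=0):
    """Step 2: fit one model per bootstrap sample and average the predictions."""
    rng = random.Random(seed)
    models = [
        fit_line(bootstrap_sample(rows, sample_size, rng))
        for _ in range(n_models)
    ]
    return mean(model(x_new) for model in models)

# Hypothetical noiseless data following y = 2x + 1.
rows = [(x, 2.0 * x + 1.0) for x in range(10)]
prediction = bagged_predict(rows, x_new=12)
```

For a categorical outcome, the final `mean` would be replaced by a majority vote over the models' predicted categories.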
Advanced Ensemble Methods – Boosting
The objective of boosting is reducing error in the model.

• Boosting achieves this by observing the error records in a model and
then oversampling misclassified records in the next model created.
During the first step, the model is applied to a sample of the data.
• A new sample is drawn that is more likely to select records that were
misclassified in the first model.
Next, the second model is applied to the new sample.
• The steps are repeated multiple times by fitting a model over and over.
The purpose of boosting is to improve performance and reduce
misclassification.
• The final model will typically have better predictive performance than
any of the individual models.
Exhibit 7-4: Boosting
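The resampling loop described above can be sketched as follows. This is a simplified illustration, not a specific boosting algorithm such as AdaBoost: the base model (a one-variable threshold rule), the `boost_factor`, the data, and the final majority vote across rounds are all assumptions, since the slides do not say how the models are combined.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Pick the threshold and direction with the fewest errors on `rows`."""
    best = None
    for threshold in sorted({x for x, _ in rows}):
        for above in (0, 1):
            errors = sum(
                (above if x >= threshold else 1 - above) != y for x, y in rows
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda x: above if x >= threshold else 1 - above

def update_weights(weights, rows, model, boost_factor=3.0):
    """Raise the sampling weight of every record the model misclassified."""
    return [
        w * boost_factor if model(x) != y else w
        for w, (x, y) in zip(weights, rows)
    ]

def boost(rows, rounds=5, seed=0):
    """Fit a model per round on a sample that favors earlier mistakes."""
    rng = random.Random(seed)
    weights = [1.0] * len(rows)
    models = []
    for _ in range(rounds):
        sample = rng.choices(rows, weights=weights, k=len(rows))
        model = fit_stump(sample)
        models.append(model)
        weights = update_weights(weights, rows, model)
    return models

def predict(models, x):
    """Combine the rounds by majority vote (an assumption of this sketch)."""
    return Counter(model(x) for model in models).most_common(1)[0][0]

# Hypothetical records: (feature value, class label).
rows = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1)]
models = boost(rows)
labels = [predict(models, x) for x, _ in rows]
```

The key line is `update_weights`: records that were misclassified become more likely to be drawn into the next sample, which is the oversampling the slide describes.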
Model Recommendation
Multiple predictive models are examined and the model with the most
accurate predictions is recommended.
• Accuracy is determined by how well a model identifies relationships
and patterns in a dataset and uses this knowledge to predict outcomes.
• Accuracy is judged by how well the model predicts new observations,
not the observations in the original dataset used to develop the model.
• The most accurate prediction model(s) is then used to make better
decisions.
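The recommendation step can be sketched as scoring each candidate on holdout records that were not used to build it and picking the top scorer. The candidate models, their names, and the holdout data below are hypothetical; real AutoML tools may rank models on other metrics as well.

```python
def accuracy(model, rows):
    """Share of holdout records the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def recommend(models, holdout):
    """Score every candidate on the holdout set and return the best one."""
    scores = {name: accuracy(model, holdout) for name, model in models.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical candidate models that map a feature value to a class label.
candidates = {
    "threshold_at_3": lambda x: int(x >= 3),
    "threshold_at_5": lambda x: int(x >= 5),
    "always_zero": lambda x: 0,
}

# Hypothetical holdout records: (feature value, actual class label).
holdout = [(1, 0), (2, 0), (3, 1), (4, 1), (6, 1), (7, 1), (5, 0)]

best, scores = recommend(candidates, holdout)
# best is "threshold_at_3", which gets 6 of the 7 holdout records right.
```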
Case Study – Loan Data: Understanding When and How to Support Fiscal Responsibility in Customers
Lending Club is a P2P lending platform that wants to reduce its default
rate from 10 percent to 8 percent within a year.
• A supervised model is needed to identify borrowers with a high chance
of default.
• You will upload data into DataRobot for AutoML analysis.
• After identifying the target variable, you run the model and evaluate
the results before applying the model to predict new cases.
The AutoML results identified 2 customers out of 11 who were more
likely to default on their loans.
• Lending Club can send targeted messages to these customers about
bill paying, penalties, and free access to financial advisors.
www.mheducation.com