Reasons To Learn Probability for Machine Learning
usm systems
Sep 23 · 5 min read


Probability is the field of mathematics that measures uncertainty.

Probability is a pillar of machine learning, and many say it is essential to study before getting started. This is misleading advice, because probability makes more sense to a learner once they have the context of the applied machine learning process in which to interpret it.

In this post, you will discover why machine learning practitioners study probability to improve their skills and capabilities.

After reading this post, you will know:

Not everyone should learn probability first; it depends on where you are in your journey of learning machine learning.
Some algorithms are designed using tools and techniques from probability, such as Naive Bayes and probabilistic graphical models.
The maximum likelihood framework that underlies the training of many machine learning algorithms comes from the field of probability.
Let’s start.


Overview

This tutorial is divided into seven sections; they are:

1. Reasons for not learning probability

2. It is necessary to assess the likelihood of class membership

3. Some algorithms are designed using probability

4. Models are trained using a probabilistic framework

5. Models can be tuned with a probabilistic framework

6. Probabilistic measures are used to evaluate model performance

7. One More Reason

Reasons for not learning probability

Before we get into the reasons why you should learn probability, let's take a brief look at the reasons why you shouldn't.

If you are just starting out with applied machine learning, I don't think you should study probability first.


It is not necessary.

You do not need an appreciation of the underlying abstract theory in order to use machine learning algorithms as tools to solve problems.

It is slow.

Taking months or years to study an entire related field before starting machine learning will delay you in achieving your goal of being able to work through predictive modeling problems.

This is a huge field.

Not all of probability is relevant to theoretical machine learning, let alone applied machine learning.

I recommend a breadth-first approach to getting started in applied machine learning.

I call this the results-first approach. You start by learning and practicing the steps for working through a predictive modeling problem end-to-end (i.e., how to get results) with a tool (such as scikit-learn and Pandas in Python).
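As a rough sketch of what working a problem end-to-end can look like, the snippet below loads a dataset, fits a model, and reports a result with scikit-learn and Pandas. The dataset (Iris) and model (logistic regression) are chosen here only for illustration, not as recommendations.

```python
# A minimal end-to-end, results-first sketch: load data, fit a model, get a result.
# Assumes scikit-learn and pandas are installed; dataset and model are illustrative.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the data as a pandas DataFrame so it can be inspected like any other table.
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Split, fit, and evaluate: the whole workflow in a few lines.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Test accuracy: %.3f" % accuracy_score(y_test, model.predict(X_test)))
```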

This process then provides the skeleton and context for progressively deepening your knowledge, such as how the algorithms work and, eventually, the mathematics that underlies them.

Once you know how to work through a predictive modeling problem, let's look at why you should deepen your understanding of probability.

1) It is necessary to assess the likelihood of class membership

Classification predictive modeling problems are those where an example is assigned a given label.

An example you may know is the Iris flowers dataset, where we have four dimensions of a flower and the goal is to assign one of three different iris species to each observation.

We can model the problem by assigning the class label directly to each
observation.

Input: Dimensions of a flower.


Output: an iris species.


A more general approach is to frame the problem as probabilistic class membership, where the probability of the observation belonging to each known class is predicted.

Input: Dimensions of a flower.


Output: the probability of membership for each iris species.

Framing the problem as a prediction of class membership simplifies the modeling problem and makes it easier for the model to learn. It allows the model to capture the ambiguity in the data, which in turn allows a downstream process, such as the user, to interpret the probabilities in the context of the domain.

The probabilities can be transformed into a crisp class label by selecting the class with the largest probability. The probabilities can also be scaled or transformed using a probability calibration process.
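As a minimal sketch of both steps, the snippet below predicts class membership probabilities, reduces them to crisp labels with an argmax, and calibrates them with scikit-learn's CalibratedClassifierCV. The choice of a k-nearest neighbors model and the sigmoid calibration method are assumptions for illustration only.

```python
# Sketch: probabilities -> crisp labels, plus optional probability calibration.
# The model (kNN) and calibration method (sigmoid) are illustrative choices.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.calibration import CalibratedClassifierCV

X, y = load_iris(return_X_y=True)

# Probabilistic framing: one probability of membership per class, per example.
model = KNeighborsClassifier(n_neighbors=5).fit(X, y)
probs = model.predict_proba(X[:5])
print(probs)

# Crisp labels: pick the class with the largest probability.
print(np.argmax(probs, axis=1))

# Calibration: rescale probabilities so they better match observed frequencies.
calibrated = CalibratedClassifierCV(KNeighborsClassifier(n_neighbors=5),
                                    method="sigmoid", cv=5).fit(X, y)
print(calibrated.predict_proba(X[:5]))
```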

Choosing a class membership framing of the prediction problem, and interpreting the predicted probabilities, requires a basic understanding of probability.

2) Some algorithms are designed using probability


There are algorithms specifically designed to harness the tools and techniques of probability.

These range from individual algorithms, such as the Naive Bayes algorithm, which is constructed using Bayes' theorem with some simplifying assumptions.

Naive Bayes
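As a small sketch of what this looks like in code, Gaussian Naive Bayes in scikit-learn fits class priors and per-class Gaussian likelihoods, then applies Bayes' theorem to produce posterior class probabilities. The dataset and split below are illustrative.

```python
# Sketch: a probability-based algorithm in practice (Gaussian Naive Bayes).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Fit class priors and per-class likelihoods, then predict via Bayes' rule.
model = GaussianNB().fit(X_train, y_train)
print("Accuracy: %.3f" % model.score(X_test, y_test))
print(model.predict_proba(X_test[:3]))  # posterior probability per class
```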
This also extends to whole fields of study, such as probabilistic graphical models, often called graphical models or PGMs for short, which are designed around Bayes' theorem.

Probabilistic graphical models


A notable graphical model is the Bayesian belief network, or Bayes net, which is capable of capturing the conditional dependencies between variables.

Bayesian belief networks

3) Models are trained using a probabilistic framework

Many machine learning models are trained using an iterative algorithm designed under a probabilistic framework.


Perhaps the most common is the framework of maximum likelihood estimation, sometimes shortened to MLE. It is a framework for estimating model parameters (e.g., weights) given observed data.
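As a sketch, with theta standing for the model parameters and x_1, ..., x_n for the observed data (notation assumed here, not taken from this article), maximum likelihood estimation picks the parameters that make the observed data most probable:

$$\hat{\theta} = \arg\max_{\theta} \sum_{i=1}^{n} \log p(x_i \mid \theta)$$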

This is the framework that underlies the ordinary least squares estimate of a linear regression model.

The expectation-maximization algorithm, or EM for short, is an approach to maximum likelihood estimation that is often used for unsupervised data clustering, e.g., estimating the k means for k clusters, also known as the k-means clustering algorithm.
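As a minimal sketch, the snippet below contrasts the two ideas with scikit-learn: a Gaussian mixture model, which is fit with expectation-maximization and yields soft cluster probabilities, and k-means, which makes hard assignments to k means. The dataset and the choice of three clusters are illustrative assumptions.

```python
# Sketch: EM-based soft clustering (GaussianMixture) vs. hard k-means assignments.
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans

X, _ = load_iris(return_X_y=True)

# EM: each point gets a probability of belonging to each of the 3 components.
gmm = GaussianMixture(n_components=3, random_state=1).fit(X)
print(gmm.predict_proba(X[:3]))

# k-means: each point is assigned to the nearest of the 3 estimated means.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=1).fit(X)
print(kmeans.labels_[:3])
```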

For models that predict class membership, maximum likelihood estimation provides a framework for minimizing the difference, or divergence, between the observed and predicted probability distributions. This is used in classification algorithms such as logistic regression, as well as deep learning neural networks.

It is common to measure this difference between probability distributions during training using entropy, e.g., via cross-entropy. Entropy, the difference between distributions measured via KL divergence, and cross-entropy all come from the field of information theory, which builds directly on probability theory. For example, entropy is calculated directly as the negative log of the probability.
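As a small illustration, cross-entropy can be computed directly from predicted class probabilities as the negative log of the probability assigned to the observed class; the labels and probabilities below are invented for the example.

```python
# Sketch: cross-entropy as the average negative log probability of the true class.
import numpy as np

# One-hot observed labels and predicted probabilities for three examples (made up).
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]])
y_pred = np.array([[0.8, 0.1, 0.1],
                   [0.3, 0.6, 0.1],
                   [0.2, 0.2, 0.6]])

# H(P, Q) = -sum_x P(x) * log(Q(x)), computed per example, then averaged.
per_example = -np.sum(y_true * np.log(y_pred), axis=1)
print(per_example)
print(per_example.mean())  # the quantity minimized during training
```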

4) Models can be tuned with a probabilistic framework

It is common to tune the hyperparameters of a machine learning model, such as k for kNN or the learning rate in a neural network.

Typical approaches include grid searching ranges of hyperparameters or randomly sampling hyperparameter combinations, as in the sketch below.
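A minimal sketch of these two approaches with scikit-learn; the model, parameter range, and cross-validation settings are illustrative choices.

```python
# Sketch: grid search vs. random search over a single hyperparameter.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Grid search: evaluate every value in a fixed grid.
grid = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [1, 3, 5, 7, 9]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))

# Random search: sample a fixed number of values from a larger range.
rand = RandomizedSearchCV(KNeighborsClassifier(), {"n_neighbors": list(range(1, 30))},
                          n_iter=10, cv=5, random_state=1)
rand.fit(X, y)
print(rand.best_params_, round(rand.best_score_, 3))
```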

Bayesian optimization is a more efficient approach to hyperparameter optimization that performs a directed search of the space of possible configurations, guided by those configurations that are most likely to lead to better performance.

As its name suggests, the approach was devised from, and harnesses, Bayes' theorem when sampling the space of possible configurations.
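The sketch below shows one way this can look in practice. It assumes the third-party scikit-optimize package (skopt), which is not part of scikit-learn, and an illustrative model and search range; other tools such as Optuna or Hyperopt implement the same idea.

```python
# Hedged sketch: Bayesian optimization of a hyperparameter with scikit-optimize.
# Assumes `pip install scikit-optimize`; model and search range are illustrative.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from skopt import BayesSearchCV
from skopt.space import Integer

X, y = load_iris(return_X_y=True)

# Each new configuration is chosen based on how promising it looks given
# the configurations evaluated so far (a directed, probability-guided search).
opt = BayesSearchCV(KNeighborsClassifier(), {"n_neighbors": Integer(1, 30)},
                    n_iter=20, cv=5, random_state=1)
opt.fit(X, y)
print(opt.best_params_, round(opt.best_score_, 3))
```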

5) Probabilistic measures are used to evaluate model performance

For algorithms that predict probabilities, evaluation measures are required to summarize the performance of the model.

There are many measures used to summarize the performance of a model based on its predicted probabilities. Common examples include aggregate measures such as log loss and the Brier score.
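A small sketch of both scores with scikit-learn, using labels and probabilities invented for illustration:

```python
# Sketch: aggregate probability-based scores (log loss and Brier score).
from sklearn.metrics import log_loss, brier_score_loss

y_true = [0, 1, 1, 0, 1]
y_prob = [0.1, 0.8, 0.6, 0.3, 0.9]  # predicted probability of class 1 (made up)

print(log_loss(y_true, y_prob))          # heavily penalizes confident wrong predictions
print(brier_score_loss(y_true, y_prob))  # mean squared error of the probabilities
```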

For binary classification tasks where a single probability score is predicted, receiver operating characteristic (ROC) curves can be constructed to explore the different cut-offs that can be used when interpreting the prediction, each resulting in a different trade-off. The area under the ROC curve, or ROC AUC, can also be calculated as an aggregate measure.
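A minimal sketch of both with scikit-learn, again with invented labels and scores:

```python
# Sketch: ROC curve points (one per cut-off) and the ROC AUC aggregate.
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.6]

fpr, tpr, thresholds = roc_curve(y_true, y_score)
for threshold, f, t in zip(thresholds, fpr, tpr):
    print("cut-off=%.2f  false positive rate=%.2f  true positive rate=%.2f"
          % (threshold, f, t))

print("ROC AUC: %.3f" % roc_auc_score(y_true, y_score))
```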

Choosing and interpreting these scoring methods requires a foundational understanding of probability theory.
