
5/25/23, 10:15 AM Yahoo Mail - Is Data Normalization Always Necessary Before Training ML Models?

Is Data Normalization Always Necessary Before Training ML Models?

From: Daily Dose of Data Science (avichawla@substack.com)

To: chidichekwas@yahoo.com

Date: Monday, May 8, 2023 at 05:01 AM CDT


Is Data Normalization Always Necessary Before Training ML Models?
If not, when is it not needed?
AVI CHAWLA
MAY 8


Data normalization is commonly used to improve the performance and
stability of ML models.

This is because normalization scales the data to a standard range, which
prevents a feature with large numeric values from dominating the model’s
output purely because of its scale. What’s more, it makes the model more
robust to variations in the data.


Figure: Different scales of columns

For instance, in the image above, the scale of Income could massively
impact the overall prediction. Normalizing the data by scaling both features to
the same range can mitigate this and improve the model’s performance.
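
To make the effect of scale concrete, here is a minimal sketch using a distance-based comparison. The three rows and their Age and Income values are made up for illustration, and min-max scaling is only one of several possible normalization choices.

```python
import math

# Hypothetical toy rows: (age in years, income in dollars).
people = {
    "A": (25, 50_000),
    "B": (60, 50_500),
    "C": (27, 51_000),
}

def euclidean(p, q):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def min_max_scale(rows):
    """Rescale every column to [0, 1] using that column's own min and max.

    Assumes no column is constant (otherwise the range would be zero).
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def nearest_to_first(rows, names):
    """Return the name of the row closest to the first row."""
    query, others = rows[0], rows[1:]
    distances = [euclidean(query, r) for r in others]
    return names[1 + distances.index(min(distances))]

names = list(people)
raw = list(people.values())
scaled = min_max_scale(raw)

nearest_raw = nearest_to_first(raw, names)        # Income dominates the distance
nearest_scaled = nearest_to_first(scaled, names)  # Age and Income contribute comparably

print("Nearest to A on raw data:   ", nearest_raw)
print("Nearest to A on scaled data:", nearest_scaled)
```

On the raw values, the 35-year age gap between A and B barely registers next to a few hundred dollars of income difference, so B looks closest to A. After scaling, the age gap counts as much as the income gap, and C becomes the nearest row.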

But is it always necessary?

While normalizing data is crucial in many cases, knowing when to do it is
equally important.

The following visual depicts which algorithms typically need normalized data
and which don’t.


Figure: Categorization of algorithms based on normalized data requirement

As shown above, many algorithms typically do not need normalized data.
These include decision trees, random forests, Naive Bayes, gradient boosting,
and more.

Consider a decision tree, for instance. It splits the data by comparing one
feature at a time against a threshold, so each split depends only on the
ordering of that feature’s values, not on their scale.
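
The scale-insensitivity of threshold splits can be checked with a small sketch. This is a hand-written one-feature split search using Gini impurity, not the implementation of any particular library, and the income values and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Find the threshold on one feature with the lowest weighted Gini impurity.

    Returns (threshold, partition), where partition marks rows sent left.
    """
    ordered = sorted(set(values))
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for a, b in zip(ordered, ordered[1:]):
        threshold = (a + b) / 2
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(values)
        if best is None or score < best[0]:
            best = (score, threshold)
    threshold = best[1]
    return threshold, tuple(v <= threshold for v in values)

# Hypothetical single feature (income) with binary labels.
incomes = [30_000, 42_000, 58_000, 75_000, 90_000, 120_000]
labels = [0, 0, 0, 1, 1, 1]

# Min-max scale the same feature to [0, 1].
low, high = min(incomes), max(incomes)
scaled = [(v - low) / (high - low) for v in incomes]

raw_threshold, raw_partition = best_split(incomes, labels)
scaled_threshold, scaled_partition = best_split(scaled, labels)

print("Raw threshold:   ", raw_threshold)
print("Scaled threshold:", scaled_threshold)
print("Same partition:  ", raw_partition == scaled_partition)
```

The threshold itself changes with the units, but the rows that fall on each side of it do not, so the resulting tree makes the same predictions either way.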


Figure: Decision tree

Thus, it’s important to understand the nature of your data and the algorithm
you intend to use.

You may not need data normalization at all if the algorithm is insensitive
to the scale of the data.

Over to you: What other algorithms typically work well without normalizing
data? Let me know :)



Find the code for my tips here: GitHub.

I like to explore, experiment and write about data science concepts and tools.
You can read my articles on Medium. Also, you can connect with me on
LinkedIn and Twitter.


© 2023 Avi Chawla

