
Three reasons that you should NOT use deep learning

George Seif · Aug 8, 2018 · 4 min read

Tweaking neural network hyperparameters can be tricky business. Even Peter Griffin has trouble with it!

I recently started a book-focused educational newsletter. Book Dives is a bi-weekly newsletter where for each new issue we dive into a non-fiction book. You’ll learn about the book’s core lessons and how to apply them in real life. You can subscribe to it here.

Deep Learning has been the hottest thing in AI for the past several years. In fact, it’s really what sparked new interest in AI from scientists, governments, large corporations, and practically everyone else in between! It really is a very cool science with potentially tremendous practical and positive applications. It’s being used in finance, engineering, entertainment, and consumer products and services.

But should we really be using it everywhere? Whenever we build something new, should we automatically go with deep learning?

There are a few cases where it really isn’t appropriate to use deep learning, and a number of reasons why you would choose to go another route. Let’s explore them…

(1) It doesn’t work so well with small data


To achieve high performance, deep networks require extremely large datasets. The more labelled data we have, the better our model performs. But well-annotated data can be expensive and time-consuming to acquire. Hiring people to manually collect images and label them is not efficient at all. And in the deep learning era, data is arguably your most valuable resource.

The networks achieving high performance in the latest research are often trained on hundreds of thousands or even millions of samples. For many applications, such large datasets simply aren’t available. On smaller datasets, classical ML algorithms such as regression, random forests, and SVMs often outperform deep networks.
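Running this comparison yourself is cheap. Here’s a minimal sketch using scikit-learn (my choice of library; the article doesn’t name one) with the tiny 150-sample Iris dataset standing in for the small-data regime, cross-validating a regression, a random forest, and a small neural network side by side:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# A tiny 150-sample dataset stands in for the "small data" regime.
X, y = load_iris(return_X_y=True)

models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(random_state=0),
    MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0),
]

# 5-fold cross-validation: with this little data, the classical models
# are competitive out of the box, at a fraction of the tuning effort.
for model in models:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{type(model).__name__}: {scores.mean():.3f}")
```

Exact numbers will vary by dataset; the point is that the classical baselines take minutes to try before committing to a deep network.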

(2) Deep Learning in practice is hard and expensive


Deep learning is still a very cutting-edge technique. You can definitely get a quick and easy solution, as many people do, especially with widely available APIs such as Clarifai and Google’s AutoML. But if you want to do something fairly customized, such services won’t cut it. You’re limited to doing something at least slightly similar to what everyone else is doing, unless you’re willing to spend money on research…

Which is also expensive, not only because of the resources required to get the data and computing power, but also because of the cost of hiring researchers. Deep learning research is very hot right now, so all three of these expenses are heavily inflated. You also end up with a lot of overhead: when doing something this customized, you spend much of your time just experimenting and breaking things.
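For a sense of what the “quick and easy” route looks like, here is a minimal sketch using an off-the-shelf pretrained torchvision model rather than a hosted API like Clarifai or AutoML (my substitution, not the article’s; assumes a recent torchvision, and “photo.jpg” is a hypothetical input file):

```python
import torch
from PIL import Image
from torchvision import models, transforms

# An off-the-shelf pretrained classifier: fine as long as your problem
# looks like everyone else's (here, ImageNet-style image classification).
model = models.resnet50(weights="IMAGENET1K_V2")
model.eval()

# Standard ImageNet preprocessing for torchvision classifiers.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = preprocess(Image.open("photo.jpg")).unsqueeze(0)  # hypothetical file
with torch.no_grad():
    probs = torch.softmax(model(image), dim=1)
print(probs.argmax(dim=1).item())  # predicted ImageNet class index
```

The moment your labels, inputs, or constraints stop matching what the pretrained model was built for, this shortcut runs out, and the research costs above kick in.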

(3) Deep networks are not easily interpreted


Deep networks are very “black box” in that even now researchers do not fully understand their insides. They have high predictive power but low interpretability. Hyper-parameter tuning and network design are also quite a challenge due to the lack of theoretical foundation.

There have been a lot of recent tools, like saliency maps and activation differences, that work great for some domains, similar to the one shown in the figure below. But unfortunately they don’t transfer completely to all applications. These tools are mainly designed for making sure you are not overfitting your network to the dataset or focusing on particular features that are spurious. It is still very difficult to attribute per-feature importance to the overall decision of a deep net.

Visualizations of features in a deep convolutional neural network
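To make one of these tools concrete, here is a minimal sketch of a vanilla gradient saliency map in PyTorch (my choice of framework; the article doesn’t specify one), with a random tensor standing in for a real preprocessed image:

```python
import torch
from torchvision import models

# Any differentiable classifier works; a pretrained ResNet is used here.
model = models.resnet18(weights="IMAGENET1K_V1")
model.eval()

# A random stand-in for a preprocessed input image.
image = torch.rand(1, 3, 224, 224, requires_grad=True)

# Backpropagate the top class score to the input pixels.
scores = model(image)
scores[0, scores.argmax()].backward()

# The saliency map is the per-pixel gradient magnitude: large values mark
# pixels the prediction is most sensitive to.
saliency = image.grad.abs().max(dim=1).values  # shape: (1, 224, 224)
```

Note what this gives you: a heat map of sensitive pixels, not a reason. That is exactly the gap the article is pointing at.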

On the other hand, classical ML algorithms such as regression or random forests are quite easy to interpret and understand because of the direct feature engineering involved. In addition, tuning hyper-parameters and altering the model design is more straightforward, since we have a more thorough understanding of the data and the underlying algorithms. These qualities are particularly important when the results of the model have to be translated and delivered to the public or a non-technical audience. We can’t just say “we sold that stock” or “we used this drug on that patient” because our deep network said so. We need to know why. Unfortunately, all the evidence we have for deep learning so far is empirical.
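For contrast with the saliency sketch above, here is how directly a random forest answers the “why” question, using scikit-learn’s built-in breast cancer dataset (my example, not the article’s):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# A small tabular dataset with human-readable feature names.
data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(data.data, data.target)

# Per-feature importances come for free, so we can point at the exact
# features that drive the model's decisions.
ranked = sorted(zip(data.feature_names, model.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```

A ranked list of named features is something you can put in front of a regulator or a patient; a tensor of pixel gradients usually is not.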

Like to learn?
Follow me on Twitter, where I post all about the latest and greatest AI, technology, and science! Connect with me on LinkedIn too!

