What is Deep Learning?

Deep Learning is a subset of Machine Learning that uses mathematical functions
to map inputs to outputs. These functions can extract non-redundant
information or patterns from the data, which enables them to form a
relationship between the input and the output.
This is known as learning, and the process of learning is called training.
In traditional computer programming, input and a set of rules are combined
to produce the desired output. In machine learning and deep learning, the
input and the output are used to derive the rules.
These rules, when applied to new input, yield the desired results.
What is Deep Learning?
Deep learning models use artificial neural networks or simply neural networks to
extract information.
Deep Learning was first theorized in the 1980s, but it has only become
practical recently because:
• It requires large amounts of labeled data.
• It requires significant computational power (high-performance GPUs).
Neural Networks
The neural network is the heart of deep learning models, and it was initially
designed to mimic the working of the neurons in the human brain.
Its basic components are nodes (artificial neurons), the weights and biases
that connect them, and the activation functions applied at each node.
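As a concrete illustration, here is a minimal Python sketch of a single
artificial neuron (all values and names are illustrative): it computes a
weighted sum of its inputs, adds a bias, and passes the result through a
non-linear activation function.

```python
import numpy as np

def sigmoid(z):
    # squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    # weighted sum of inputs plus bias, passed through the activation
    return sigmoid(np.dot(weights, inputs) + bias)

x = np.array([0.5, -1.2, 3.0])   # input signals (illustrative)
w = np.array([0.4, 0.7, -0.2])   # weights, normally learned during training
b = 0.1                          # bias, also learned
print(neuron(x, w, b))
```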
Deep Learning vs. Machine Learning
Why is Deep Learning more powerful than traditional Machine Learning?
• Deep Learning can do everything that traditional machine learning does.
• Machine learning is useful when the dataset is small and the data has been
carefully preprocessed.
• Deep learning is extremely powerful when the dataset is large. It can learn
complex patterns from the data and draw accurate conclusions on its own.
• Deep Learning and Machine Learning are both capable of different types of
learning: Supervised Learning (labeled data), Unsupervised Learning
(unlabeled data), and Reinforcement Learning. The choice between them is
usually determined by the size and complexity of the data.
Deep Learning vs. Machine Learning
• Machine learning requires data preprocessing, which involves human
intervention.
• The neural networks in deep learning are capable of extracting features;
hence no human intervention is required.
• Deep Learning can process unstructured data.
• Deep Learning is usually based on representation learning, i.e., finding and
extracting vital information or patterns that represent the entire dataset.
• Deep learning is computationally expensive and time-consuming.
How does Deep Learning work?
• Deep Neural Networks have multiple layers of interconnected artificial
neurons or nodes that are stacked together
• A deep neural network has three types of layers: the input layer, hidden
layers, and the output layer.
• The data is fed into the input layer.
• Each node in the input layer ingests the data and passes it on to the next
layer, i.e., the hidden layers. These hidden layers progressively extract
features from the given input and transform it using linear functions
followed by non-linear activations (sketched below).
• These layers are called hidden layers because the parameters (weights and
biases) in each node are not known in advance; they start out as random
values that transform the data, each setting yielding a different output.
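To make this concrete, here is a minimal NumPy sketch of forward propagation
through one hidden layer and an output layer. The layer sizes and the random
initialization are illustrative assumptions, not an architecture taken from
this text.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # non-linear activation: keeps positive values, zeroes out the rest
    return np.maximum(0.0, z)

# Illustrative sizes: 4 input features, 8 hidden nodes, 3 output classes.
x = rng.normal(size=4)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)   # hidden-layer parameters
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)   # output-layer parameters

h = relu(W1 @ x + b1)                            # hidden layer extracts features
logits = W2 @ h + b2                             # output-layer scores
probs = np.exp(logits) / np.exp(logits).sum()    # softmax for classification
print(probs)
```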
How does Deep Learning work?
• The output from the hidden layers is then passed on to the final layer,
called the output layer, where, depending upon the task, the network
classifies, predicts, or generates samples. This process is called forward
propagation.
• In another process called backpropagation, the error is calculated by
taking the difference between the predicted output and the true output.
• This error is then reduced by fine-tuning the weights and biases of the
network, moving backward through the layers with an algorithm such as
gradient descent (sketched below).
• With each iteration, the network gradually becomes more accurate.
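Here is a hedged sketch of this forward/backward training loop using
PyTorch; the model, the toy data, and the hyperparameters are illustrative
placeholders.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

X = torch.randn(32, 4)          # toy inputs
y = torch.randn(32, 1)          # toy targets

for step in range(100):
    pred = model(X)             # forward propagation
    loss = loss_fn(pred, y)     # error between predicted and true output
    optimizer.zero_grad()
    loss.backward()             # backpropagation: gradients flow backward
    optimizer.step()            # gradient descent fine-tunes weights/biases
```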
Types of neural networks
CNN
Convolutional Neural Networks, or CNNs, are primarily used for tasks related
to computer vision and image processing.
CNNs are extremely good at modeling spatial data such as 2D or 3D images and
videos. They can extract features and patterns within an image, enabling tasks
such as image classification or object detection.
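As one possible illustration, here is a minimal PyTorch sketch of a small CNN
for image classification; the architecture and the assumed input (28x28
grayscale images, 10 classes) are choices made for the example.

```python
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # extract local spatial features
    nn.ReLU(),
    nn.MaxPool2d(2),                             # downsample 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1), # deeper features
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # scores for 10 classes
)
```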
Types of neural networks
RNN
Recurrent Neural Networks, or RNNs, are primarily used to model sequential
data, such as text, audio, or any other data that represents a sequence or
time series. They are often used in tasks related to natural language
processing (NLP).
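Here is a minimal PyTorch sketch of an RNN-based text classifier; the
vocabulary size, dimensions, and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TextClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=64,
                 hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)        # (batch, seq_len, embed_dim)
        _, h_n = self.rnn(x)             # final hidden state summarizes sequence
        return self.out(h_n.squeeze(0))  # class scores from the last state

model = TextClassifier()
scores = model(torch.randint(0, 10_000, (8, 20)))  # 8 sequences of length 20
```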
Types of neural networks
GAN
Generative adversarial networks, or GANs, are frameworks used for tasks
related to unsupervised learning. This type of network learns the structure
and patterns of the data in such a way that it can generate new examples
similar to those in the original dataset.
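A minimal PyTorch sketch of the two networks in a GAN follows; all sizes are
illustrative. The generator maps random noise to fake samples, the
discriminator estimates whether a sample is real, and training alternates
between the two until generated samples resemble the real data.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, data_dim), nn.Tanh(),       # produces a fake sample
)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),           # probability that input is real
)

z = torch.randn(8, latent_dim)                 # random noise
fake = generator(z)                            # generated samples
p_real = discriminator(fake)                   # discriminator's judgment
```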
Deep Learning Limitations
1. Data availability
Deep learning models require a lot of data to learn the representation, structure,
distribution, and pattern of the data.
If there isn't enough varied data available, then the model will not learn well and
will lack generalization (it won't perform well on unseen data).
The model can only generalize well if it is trained on large amounts of data.

2. The complexity of the model
Designing a deep learning model is often a trial-and-error process.
A simple model is likely to underfit, i.e., be unable to extract enough
information from the training set, while a very complex model is likely to
overfit, i.e., fail to generalize well to the test dataset.
Deep learning models perform well when their complexity is appropriate to
the complexity of the data.
Deep Learning Limitations
3. Incapable of Multitasking
Deep neural networks are incapable of multitasking.
These models can only perform the targeted tasks they were trained for. For
instance, a model trained to classify cats and dogs will not classify men
and women.

4. Hardware dependence
Deep learning models are computationally expensive.
These models are so complex that a standard CPU cannot handle the
computational load; high-performance GPUs are typically required.
Deep Learning Applications
Deep Learning finds applications in:
• Speech recognition: Some familiar assistants, like Apple's Siri, Amazon's
Alexa, and Microsoft's Cortana, are based on deep neural networks.
• Pattern recognition: Pattern recognition is very useful in the medical and
life sciences. These algorithms can help radiologists find tumor cells in CT
scans.
• Natural language processing
• Recommender systems: Recommender systems are on almost every social
media platform these days, from Instagram to YouTube and Netflix. These
companies use recommendation systems to suggest shows, videos, posts, and
stories based on users' activities.
Real-life Deep Learning use cases
Healthcare
• Medical image analysis: Medical images such as CT scans, MRIs, and X-rays
can sometimes be difficult to interpret. Deep learning algorithms can help
find anomalies that are invisible to the naked eye.
• Surgical robotics: There are times when a critical patient is unable to find
a surgeon; in such life-threatening conditions, surgical robots can come to
the rescue. Such robots have a superhuman ability to repeat exact motions
like those of a trained surgeon.
Real-life Deep Learning use cases
Transportation
• Self-driving cars: Self-driving cars are one of the trending topics in the
world right now. Companies use deep learning as their core algorithm; these
models can consume a lot of data and enable cars to navigate roads while
making correct decisions by analyzing the roads and vehicles around them.
Some systems can even anticipate likely collisions.
• Smart cities: Smart cities can manage their resources efficiently and handle
traffic, public services, and disaster response. Input from sensors all over
the city is used to collect data, and a deep learning system trained on that
data can predict different outcomes depending on the scenario.
Real-life Deep Learning use cases
Agriculture
• Robot picking: Deep learning can be used to enable robots that classify and
pick crops. These robots can save time and increase the production rate as
well.
• Crop and soil monitoring: A deep learning model trained on crop and soil
condition data can be used to build a system that effectively monitors crops
and soil and helps estimate yield.
• Livestock monitoring: Animals move from one place to another, making them
difficult to monitor. Image annotation with deep learning can enable farmers
to track animals' locations, predict the livestock's food needs, and monitor
rest cycles to ensure that they are in good health.
• Plant disease and pest detection: Another useful area for deep learning in
agriculture is distinguishing diseased plants from healthy ones. This type of
system can help farmers apply proper treatment before the plants die.
Furthermore, deep learning can also be used to detect pest infestations.
Greedy layer-wise training
• Training deep neural networks with many layers was challenging.
• As the number of hidden layers increases, the amount of error information
propagated back to earlier layers is dramatically reduced. This means that
weights in hidden layers close to the output layer are updated normally,
whereas weights in hidden layers close to the input layer are updated
minimally or not at all.
• This problem prevented the training of very deep neural networks and is
referred to as the vanishing gradient problem (demonstrated in the sketch
after this list).
• An important milestone in the resurgence of neural networks that initially
allowed the development of deeper models was the technique of greedy
layer-wise pretraining, often simply referred to as "pretraining."
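The following small PyTorch experiment illustrates the vanishing gradient
problem; the depth and layer sizes are illustrative. In a deep stack of
sigmoid layers, the average gradient magnitude is far smaller near the input
layer than near the output layer.

```python
import torch
import torch.nn as nn

# Build a deep stack of 20 sigmoid layers (illustrative depth).
layers = []
for _ in range(20):
    layers += [nn.Linear(32, 32), nn.Sigmoid()]
net = nn.Sequential(*layers)

x = torch.randn(1, 32)
net(x).sum().backward()                     # propagate error backward

first = net[0].weight.grad.abs().mean()     # layer nearest the input
last = net[-2].weight.grad.abs().mean()     # layer nearest the output
print(f"gradient near input: {first:.2e}, near output: {last:.2e}")
```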
Greedy layer-wise training
• Pretraining involves successively adding a new hidden layer to a model and
refitting, allowing the newly added layer to learn from the outputs of the
existing hidden layers, often while keeping the weights of the existing
hidden layers fixed. This gives the technique the name "layer-wise," as the
model is trained one layer at a time (sketched after this list).
• The technique is referred to as "greedy" because it takes a piecewise,
layer-wise approach to the harder problem of training the whole deep
network at once.
• As an optimization process, dividing training into a succession of
layer-wise stages is a greedy shortcut that likely leads to an aggregate of
locally optimal solutions: a shortcut to a good-enough global solution.
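Here is a hedged PyTorch sketch of supervised greedy layer-wise pretraining;
the sizes, toy data, and training schedule are illustrative assumptions. At
each stage the existing hidden layers are frozen, and a new hidden layer
plus a fresh output layer are fit on the supervised task.

```python
import torch
import torch.nn as nn

def train(model, X, y, steps=200):
    # fit only the parameters that are not frozen
    opt = torch.optim.SGD(
        [p for p in model.parameters() if p.requires_grad], lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()

X, y = torch.randn(256, 20), torch.randint(0, 3, (256,))  # toy labeled data

hidden = [nn.Linear(20, 32), nn.ReLU()]          # first hidden layer
output = nn.Linear(32, 3)
train(nn.Sequential(*hidden, output), X, y)      # fit the shallow model

for stage in range(2):                           # add layers one at a time
    for p in (q for layer in hidden for q in layer.parameters()):
        p.requires_grad = False                  # freeze existing layers
    hidden += [nn.Linear(32, 32), nn.ReLU()]     # newly added hidden layer
    output = nn.Linear(32, 3)                    # fresh output layer
    train(nn.Sequential(*hidden, output), X, y)  # fit only the new layers
```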
Greedy layer-wise training
• Pretraining is based on the assumption that it is easier to train a shallow
network than a deep network, and it contrives a layer-wise training process
in which we are only ever fitting a shallow model.
• The key benefits of pretraining are:
1. Simplified training process.
2. Facilitates the development of deeper networks.
3. Useful as a weight-initialization scheme.
4. Perhaps lower generalization error.
• In general, pretraining may help both in terms of optimization and in terms of
generalization.
Greedy layer-wise training
There are two main approaches to pretraining; they are:
1. Supervised greedy layer-wise pretraining.
2. Unsupervised greedy layer-wise pretraining.
• Supervised pretraining involves successively adding hidden layers to a model
trained on a supervised learning task.
• Unsupervised pretraining involves using the greedy layer-wise process to build
up an unsupervised autoencoder model, to which a supervised output layer is
later added.
• Unsupervised pretraining may be appropriate when you have a significantly
larger number of unlabeled examples that can be used to initialize a model,
prior to using a much smaller number of labeled examples to fine-tune the
model weights for a supervised task (sketched after this list).
• Better performance may be achieved using modern methods such as better
activation functions, weight initialization, variants of gradient descent, and
regularization methods.
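Here is a hedged PyTorch sketch of unsupervised greedy layer-wise
pretraining with simple autoencoders; all sizes and training details are
illustrative. Each new encoder layer is trained to reconstruct its own
input, and the stacked encoder then initializes a supervised model to which
an output layer is added.

```python
import torch
import torch.nn as nn

X = torch.randn(256, 20)                          # unlabeled examples
encoder_layers = []
inputs, dims = X, [20, 16, 8]                     # illustrative layer widths

for d_in, d_out in zip(dims[:-1], dims[1:]):
    enc, dec = nn.Linear(d_in, d_out), nn.Linear(d_out, d_in)
    opt = torch.optim.Adam(
        list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(200):                          # train one autoencoder layer
        opt.zero_grad()
        recon = dec(torch.relu(enc(inputs)))      # reconstruct the layer's input
        nn.functional.mse_loss(recon, inputs).backward()
        opt.step()
    encoder_layers += [enc, nn.ReLU()]            # keep the trained encoder
    inputs = torch.relu(enc(inputs)).detach()     # codes feed the next layer

# Add a supervised output layer; fine-tune on a small labeled set afterward.
classifier = nn.Sequential(*encoder_layers, nn.Linear(dims[-1], 2))
```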
