
COMPARATIVE ANALYSIS OF ALGORITHMS AND APPLICATIONS OF DEEP LEARNING

1. INTRODUCTION

Deep learning is an artificial intelligence function that imitates the workings of the human brain in processing data and creating patterns for use in decision making. The main objective of this paper is to provide a comprehensive examination of deep learning algorithms and applications. Deep learning has exploded in the public consciousness, primarily as predictive and analytical products fill our world in the form of numerous human-centered smart-world systems, including targeted advertisements, natural language assistants and translators, and prototype self-driving vehicle systems (Sathiyamoorthi & Murali, 2012a). Yet to most people, the underlying mechanisms that enable such human-centered smart products remain obscure. In contrast, researchers across disciplines have been integrating deep learning into their work to solve problems that could not have been approached before. Specifically, as a curated collection of the state of the art in deep learning research, this paper provides a broad orientation for those seeking a primer on deep learning algorithms and their various applications, platforms, and uses in a variety of smart-world systems. Furthermore, it reviews current key developments in the technology and provides insight into areas in which deep learning can improve exploration, as well as highlighting new areas of research that have yet to see the application of deep learning but could nevertheless benefit greatly. Finally, this survey provides a valuable reference for new deep learning practitioners, as well as for those seeking to innovate in the application of deep learning.
1.1 Machine Learning

Machine Learning (ML) is the field in which an agent is said to learn from experience with respect to some class of tasks and a performance measure P. The task could be answering exams in a particular subject, or it could be diagnosing patients with a specific illness. As shown in figure 1 below, machine learning is a subset of Artificial Intelligence (AI): AI systems contain artificial neurons and react to given stimuli, whereas machine learning uses statistical techniques for knowledge discovery (Sathiyamoorthi & Murali, 2011a). Deep learning is the subset of machine learning that uses artificial neural networks for the learning process (Sathiyamoorthi & Murali, 2012b).

Figure 1: Taxonomy of Knowledge Discovery


Further, machine learning can be categorized into supervised, unsupervised and reinforcement learning. For any kind of task, machine learning involves three components. The first component is defining the set of tasks on which learning will take place, and the second is setting up a performance measure P. To judge whether learning is happening or not, defining the performance criterion P is mandatory in machine learning tasks. Consider the example of answering questions in an exam: the performance criterion would be the number of marks that you get. Similarly, in the example of diagnosing patients with a specific illness, the performance measure would be the number of patients who did not have an adverse reaction to the given drugs. So, there exist different ways of defining performance metrics depending on what you are looking for within a given domain. The last important component of machine learning is the experience (Sathiyamoorthi & Murali, 2011b). For example, experience in the case of writing exams could be writing more exams, meaning the more you write, the better you get; in the case of diagnosing illnesses it could be the number of patients examined, i.e., the more patients you look at, the better an expert you become at diagnosing illness. Hence, there are three components involved in learning: a class of tasks, a performance measure, and well-defined experience. This kind of learning, in which you learn to improve your performance based on experience, is known as inductive learning.
The chapter is organized as follows. Section 1 provides a quick introduction to the main paradigms of machine learning, namely supervised, unsupervised and reinforcement learning, and groups popular algorithms by similarity. Section 2 introduces deep learning and surveys recent work on learning algorithms, Section 3 categorizes deep learning algorithms and frameworks, and Section 4 discusses applications of deep learning in various fields before the chapter concludes with research insights (Sathiyamoorthi & Murali, 2012c).

1.2 Machine Learning Techniques


There are various machine learning paradigms, and the first one is supervised learning, in which one learns a mapping from input to output. For example, in the case of diagnosing patients, the input could be a description of a patient who comes to the clinic, and the output would be whether the patient has a certain disease or not. Similarly, take the example of writing an exam: the input could be some kind of equation and the output would be the answer to the question, or it could be a true-or-false question, i.e., you are given a description of the question and have to state whether it is true or false as the output. So, the essential part of supervised learning is mapping from the given input to the required output. If the output you are looking for happens to be categorical, such as whether a patient has a disease or not, or whether an answer is true or false, then the task is called classification. If the output happens to be a continuous value, like how long a product will last before it fails, or what the expected rainfall is tomorrow, then those kinds of problems are called regression problems (Sathiyamoorthi & Murali, 2012d). Thus, classification and regression are the two classes of the supervised learning process, as shown in figure 2; a minimal sketch of both tasks follows the figure.

Figure 2: Supervised Learning Process
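As a concrete illustration of this input-to-output mapping, the hedged sketch below trains one classifier and one regressor on tiny synthetic datasets. It assumes scikit-learn is available; the feature values, labels, and model choices are invented purely for illustration.

```python
# Minimal sketch: classification vs. regression as supervised input-to-output maps.
# Assumes scikit-learn is installed; the tiny datasets below are purely illustrative.
from sklearn.linear_model import LogisticRegression, LinearRegression

# Classification: categorical output (1 = has disease, 0 = does not)
X_patients = [[37.0, 60], [39.5, 95], [36.8, 55], [40.1, 110]]  # [temperature, heart rate]
y_disease = [0, 1, 0, 1]
clf = LogisticRegression().fit(X_patients, y_disease)
print("Predicted class:", clf.predict([[39.0, 90]]))            # e.g. [1]

# Regression: continuous output (expected rainfall in mm)
X_weather = [[0.80, 1002], [0.40, 1021], [0.65, 1010], [0.90, 998]]  # [humidity, pressure]
y_rain_mm = [23.0, 0.0, 8.5, 31.0]
reg = LinearRegression().fit(X_weather, y_rain_mm)
print("Predicted rainfall (mm):", reg.predict([[0.70, 1008]]))
```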


The second paradigm is known as unsupervised learning, shown in figure 3, where a mapping from input to output is not required. The main goal is not to produce an output in response to a given input; instead, the aim is to discover patterns in the data. Therefore, in unsupervised learning there is no desired output that we are looking for; the algorithm looks for closely related patterns in the data. Clustering is one such task, where the algorithm tries to find cohesive groups among the given input patterns. For example, one might look at the customers who come to a shop and want to figure out whether they fall into different categories, such as college students or IT professionals, and so on. The other popular unsupervised learning paradigm is known as association rule mining or frequent pattern mining, where one is interested in finding frequent co-occurrences of items in the given data, e.g., whenever item A is purchased, item B is also purchased. Therefore, one can learn these kinds of relationships as associations in the data.

Figure 3: Unsupervised Learning Process


The third form of learning is called reinforcement learning. It is neither supervised nor unsupervised in nature. In reinforcement learning, you have an agent acting in an environment, and you want to figure out what action the agent must take at every step; the action that the agent takes is based on the rewards or penalties that the agent receives in different states.
Apart from these three types of learning, one more form of learning is possible, called semi-supervised learning, shown in figure 4. It is a combination of supervised and unsupervised learning: you have some labeled training data along with a larger amount of unlabeled training data, and you try to come up with a learner that works even when labeled training data is limited.

Figure 4: Semi-supervised Learning Process


Irrespective of the domain and the type of learning, every task needs to have a performance measure. In classification, the performance measure would be the classification error, i.e., the number of misclassified instances relative to the total number of instances. Similarly, the prediction error serves as the performance measure in regression: if I said it was going to rain 23 millimeters and it ended up raining 49 centimeters, then this huge difference between the actual and predicted value is the prediction error. In the case of clustering, it is a little harder to define performance measures, since we do not know what a good clustering is or how to measure the quality of clusters. So there exist different kinds of clustering measures, one of which is the scatter or spread of a cluster, which essentially tells you how spread out the points belonging to a single group are. Thus, good clustering algorithms should minimize intra-cluster distance and maximize inter-cluster distance. Association rule mining uses measures called support and confidence, whereas reinforcement learning tries to minimize the cost accrued while controlling the system. A minimal sketch of a few of these measures is given below.
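The following sketch, a minimal NumPy illustration and not part of the original text, computes the measures just described: classification error as the fraction of misclassified instances, prediction error as the gap between actual and predicted values, and a rough intra- versus inter-cluster comparison for clustering quality. All numbers are made up.

```python
# Minimal sketch of the performance measures described above (NumPy assumed available).
import numpy as np

# Classification error: misclassified instances / total instances
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 0])
print("Classification error:", np.mean(y_true != y_pred))   # 2/6 ≈ 0.33

# Prediction error for regression, e.g. predicted 23 mm of rain vs. 490 mm (49 cm) observed
actual_mm, predicted_mm = 490.0, 23.0
print("Absolute prediction error (mm):", abs(actual_mm - predicted_mm))

# Clustering quality: intra-cluster spread should be small, inter-cluster distance large
cluster_a = np.array([[1.0, 1.1], [0.9, 1.0]])
cluster_b = np.array([[5.0, 5.2], [5.1, 4.9]])
intra = cluster_a.std(axis=0).mean()
inter = np.linalg.norm(cluster_a.mean(axis=0) - cluster_b.mean(axis=0))
print("Intra-cluster spread:", intra, "Inter-cluster distance:", inter)
```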
There are several challenges in building a machine learning solution to a given problem, and a few of these are given below. The first issue is how good the model is and which performance measures to use. Most of the measures discussed above turn out to be insufficient on their own, and other practical considerations, such as user skill and experience, come into play when selecting a model and its measures. The second issue is, of course, the presence of noisy and missing data, which leads to errors in the predicted values. Suppose a value in a medical record is 225. What does that mean? It could be 225 days, in which case it is a reasonable number; 22.5 years or 22.5 months would also be reasonable; but if it is 225 years, it is not a reasonable number, so there is something wrong with the data. Finally, the biggest challenge is the size of the dataset, since most algorithms perform well when data is large, but not all of them do.
1.3 Algorithms Grouped By Similarity

Algorithms are typically grouped by similarity in terms of how they work, for instance tree-based methods and neural-network-inspired methods. This is a useful grouping technique, but it is not perfect. There are still algorithms that could just as easily fit into multiple categories, like Learning Vector Quantization, which is both a neural-network-inspired method and an instance-based method. There are also categories that have the same name describing both the problem and the class of algorithms, such as regression and clustering.
These cases could be handled by listing algorithms twice or by choosing the group that is subjectively the best match. The latter approach is preferred here, as it avoids duplicating algorithms and keeps things simple. In this section, several popular machine learning algorithms are listed and grouped in the way that seems most intuitive.
The list is not exhaustive in either the groups or the algorithms, but it should help to give an idea of the lay of the land. Note that there is a strong bias towards algorithms used for classification and regression, the two most prevalent supervised machine learning problems you are likely to encounter.

A. Regression Algorithms

Regression is concerned with modeling the relationship between variables, iteratively refined using a measure of error in the predictions made by the model. Regression methods are a workhorse of statistics and have been co-opted into statistical machine learning. This may be confusing, because the term regression can refer both to the class of problem and to the class of algorithm; really, regression is a process.
Figure 5: Regression Algorithms

The most popular regression algorithms are shown in figure 5:

o Ordinary Least Squares Regression (OLSR)


o Linear Regression
o Logistic Regression
o Stepwise Regression
o Multivariate Adaptive Regression Splines (MARS)
o Locally Estimated Scatterplot Smoothing (LOESS)
B. Instance-based Algorithms
Instance-based learning models a decision problem with instances, or samples of training data, that are deemed important or required by the model. Such methods typically build up a database of example data and compare new data to the database using a similarity measure in order to find the best match and make a prediction. For this reason, instance-based methods are also referred to as winner-take-all methods and memory-based learning. The focus is placed on the representation of the stored instances and the similarity measures used between instances; a minimal nearest-neighbour sketch follows the list below.

Figure 6: Instance-based Algorithms

The most popular instance-based algorithms are shown in figure 6:

o k-Nearest Neighbor (kNN)


o Learning Vector Quantization (LVQ)
o Self-Organizing Map (SOM)
o Locally Weighted Learning (LWL)
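The hedged sketch below illustrates the instance-based idea with a minimal 1-nearest-neighbour classifier written directly in NumPy: the "model" is simply the stored training instances, and prediction compares a new point to them with a similarity measure (here, Euclidean distance). The data and labels are invented for illustration.

```python
# Minimal instance-based learning sketch: 1-nearest neighbour with Euclidean distance.
# The stored training instances *are* the model; NumPy is assumed available.
import numpy as np

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [6.0, 6.2], [5.8, 6.1]])
y_train = np.array(["small", "small", "large", "large"])

def predict_1nn(x_new):
    distances = np.linalg.norm(X_train - x_new, axis=1)  # similarity measure
    return y_train[np.argmin(distances)]                 # winner takes all

print(predict_1nn(np.array([1.1, 0.9])))   # -> "small"
print(predict_1nn(np.array([5.9, 6.0])))   # -> "large"
```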
C. Regularization Algorithms

Regularization methods are extensions made to another method (typically regression methods) that penalize models based on their complexity, favoring simpler models that are also better at generalizing. Regularization algorithms are listed separately here because they are popular, powerful, and generally simple modifications made to other methods; a minimal sketch of the ridge penalty follows the list below.

Figure 7: Regularization Algorithms

The most popular regularization algorithms are shown in figure 7:

o Ridge Regression
o Least Absolute Shrinkage and Selection Operator (LASSO)
o Elastic Net
o Least-Angle Regression (LARS)
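As a hedged illustration of the penalty idea, the sketch below computes the closed-form ridge regression solution in NumPy, where the extra lambda * I term shrinks the coefficients relative to ordinary least squares. The data and penalty strength are made up for illustration.

```python
# Minimal ridge regression sketch: OLS plus an L2 penalty on the coefficients.
# w = (X^T X + lambda * I)^(-1) X^T y ; data and lambda are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=20)

lam = 1.0                                                # regularization strength
I = np.eye(X.shape[1])
w_ols = np.linalg.solve(X.T @ X, X.T @ y)                # ordinary least squares
w_ridge = np.linalg.solve(X.T @ X + lam * I, X.T @ y)    # penalized (shrunk) solution

print("OLS coefficients:  ", w_ols)
print("Ridge coefficients:", w_ridge)   # pulled toward zero, favoring a simpler model
```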
D. Decision Tree Algorithms

Decision tree methods construct a model of decisions made based on actual values of
attributes in the data. Decisions fork in tree structures until a prediction decision is made for a
given record. Decision trees are trained on data for classification and regression problems.
Decision trees are often fast and accurate and a big favorite in machine learning.

Figure 8: Decision Tree Algorithms

The most popular decision tree algorithms are shown in figure 8:

o Classification and Regression Tree (CART)


o Iterative Dichotomiser 3 (ID3)
o C4.5 and C5.0 (different versions of a powerful approach)
o Chi-squared Automatic Interaction Detection (CHAID)
o Decision Stump
o M5
o Conditional Decision Trees
E. Bayesian Algorithms

Bayesian methods are those that explicitly apply Bayes' Theorem to problems such as classification and regression; a small worked example of Bayes' rule follows the list below.
Figure 9: Bayesian Algorithms

The most popular Bayesian algorithms are shown in figure 9:

o Naive Bayes
o Gaussian Naive Bayes
o Multinomial Naive Bayes
o Averaged One-Dependence Estimators (AODE)
o Bayesian Belief Network (BBN)
o Bayesian Network (BN)
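To make "explicitly applying Bayes' Theorem" concrete, the short sketch below computes a posterior probability for a toy diagnosis problem in plain Python; the prior, sensitivity, and false-positive rate are invented for illustration.

```python
# Minimal Bayes' Theorem sketch: P(disease | positive test).
# All probabilities below are invented for illustration only.
prior = 0.01           # P(disease)
sensitivity = 0.95     # P(positive | disease)
false_positive = 0.05  # P(positive | no disease)

evidence = sensitivity * prior + false_positive * (1 - prior)   # P(positive)
posterior = sensitivity * prior / evidence                      # Bayes' rule

print(f"P(disease | positive test) = {posterior:.3f}")  # ≈ 0.161
```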
F. Clustering Algorithms

Clustering, like regression, describes the class of problem and the class of methods.
Clustering methods are typically organized by modeling approach, such as centroid-based and hierarchical. All methods are concerned with exploiting the inherent structures in the data to best organize it into groups of maximum commonality (Sathiyamoorthi & Murali, 2012).

Figure 10: Clustering Algorithms

The most popular clustering algorithms are shown in figure 10:

 K-Means
K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set using a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centers, one for each cluster. These centers should be placed carefully, because different locations lead to different results, so the better choice is to place them as far away from each other as possible. The next step is to take each point belonging to the data set and associate it with the nearest center (Sathiyamoorthi & Murali, 2013). When no point is pending, the first step is completed and an early grouping is done. At this point, we need to re-calculate k new centroids as the barycenters of the clusters resulting from the previous step. Once we have these k new centroids, a new binding has to be done between the data set points and the nearest new center. A loop is thereby generated. As a result of this loop, we may notice that the k centers change their location step by step until no more changes are made, in other words, until the centers do not move anymore (Sathiyamoorthi & Murali, 2009b). A minimal NumPy sketch of this loop is given below.
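The hedged NumPy sketch below implements exactly the loop described above: pick k initial centers, assign each point to its nearest center, recompute each centroid as the mean (barycenter) of its group, and repeat until the centers stop moving. The two-dimensional data and k = 2 are illustrative only.

```python
# Minimal K-Means sketch (Lloyd's loop) following the steps described above.
# NumPy assumed available; the 2-D data and k=2 are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.5, (25, 2)), rng.normal(5, 0.5, (25, 2))])
k = 2
centers = data[rng.choice(len(data), k, replace=False)]   # initial centers

for _ in range(100):
    # assign each point to its nearest center
    labels = np.argmin(np.linalg.norm(data[:, None] - centers[None, :], axis=2), axis=1)
    # recompute each center as the barycenter of its cluster
    new_centers = np.array([data[labels == j].mean(axis=0) for j in range(k)])
    if np.allclose(new_centers, centers):   # centers no longer move: done
        break
    centers = new_centers

print("Final centers:\n", centers)
```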
 K-Medians

o The goal of K-Medians clustering, like K-Means clustering, is to separate the data into distinct groups based on the differences within the data. Upon completion, the analyst is left with k distinct groups with distinguishing characteristics.
o For example, the goal of K-Medians clustering applied to adenocarcinoma data would be to form k clusters, each of which has dissimilar survival times or rates.

 Expectation Maximization (EM)


The EM (expectation maximization) technique is similar to the K-Means technique. The
basic operation of K-Means clustering algorithms is relatively simple: Given a fixed number
of k clusters, assign observations to those clusters so that the means across clusters (for all
variables) are as different from each other as possible. The EM algorithm extends this basic
approach to clustering in two important ways:
Instead of assigning examples to clusters to maximize the differences in means for
continuous variables, the EM clustering algorithm computes probabilities of cluster memberships
based on one or more probability distributions. The goal of the clustering algorithm is then to maximize the overall probability or likelihood of the data, given the (final) clusters (Sathiyamoorthi & Murali, 2009a). A short scikit-learn sketch of this idea is given below.
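The hedged sketch below uses scikit-learn's GaussianMixture (an assumption about the available tooling) to illustrate the key difference from K-Means: each point receives probabilities of membership in every cluster rather than a single hard assignment. The data is synthetic and illustrative.

```python
# Minimal EM clustering sketch with a Gaussian Mixture Model.
# scikit-learn and NumPy are assumed available; the data is illustrative only.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
data = np.vstack([rng.normal(0.0, 1.0, (40, 2)), rng.normal(4.0, 1.0, (40, 2))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(data)

point = np.array([[2.0, 2.0]])                 # ambiguous point between the two groups
print("Cluster membership probabilities:", gmm.predict_proba(point))
print("Hard assignment (for comparison):", gmm.predict(point))
```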
 Hierarchical Clustering

o Hierarchical clustering algorithms are of two types:


o Agglomerative Hierarchical clustering algorithm or AGNES (agglomerative nesting) and
o Divisive Hierarchical clustering algorithm or DIANA (divisive analysis).

G. Association Rule Mining

Association rule learning methods extract rules that best explain observed relationships between variables in the data. These rules can discover important and commercially useful associations in large multidimensional datasets that can be exploited by an organization; a small sketch of the support and confidence measures follows the list below.

Figure 11: Association Rule Learning Algorithms

The most popular association rule learning algorithms are shown in figure 11:

o Apriori algorithm
o Eclat algorithm
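As a hedged illustration of the measures these algorithms rely on, the sketch below computes support and confidence for one candidate rule over a handful of made-up transactions in plain Python.

```python
# Minimal sketch of the support and confidence measures used in association rule mining.
# The transactions and the candidate rule {bread} -> {butter} are invented for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

antecedent, consequent = {"bread"}, {"butter"}
n = len(transactions)

support_rule = sum((antecedent | consequent) <= t for t in transactions) / n
support_antecedent = sum(antecedent <= t for t in transactions) / n
confidence = support_rule / support_antecedent

print(f"support({antecedent} -> {consequent}) = {support_rule:.2f}")    # 3/5 = 0.60
print(f"confidence({antecedent} -> {consequent}) = {confidence:.2f}")   # 3/4 = 0.75
```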
H. Artificial Neural Network Algorithms

Artificial Neural Networks are models inspired by the structure and/or function of biological neural networks (Sathiyamoorthi, 2016). They are a class of pattern matching commonly used for regression and classification problems, but they are really an enormous subfield comprising many algorithms and variations for all manner of problem types. Note that deep learning is treated separately from neural networks here because of the huge growth and popularity of the field; this section is concerned with the more classical methods. A minimal perceptron sketch follows the list below.

Figure 12: Artificial Neural Network Algorithms

The most popular artificial neural network algorithms are shown in figure 12:

o Perceptron
o Back-Propagation
o Hopfield Network
o Radial Basis Function Network (RBFN)
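The hedged NumPy sketch below trains a single perceptron, the first algorithm in the list, on a toy linearly separable problem (logical AND). It is a minimal illustration of the classical error-driven update rule, not a production implementation; the learning rate and epoch count are arbitrary.

```python
# Minimal perceptron sketch: weights are nudged whenever a prediction is wrong.
# NumPy assumed available; the AND truth table is the illustrative training set.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])            # logical AND

w = np.zeros(2)
b = 0.0
lr = 1.0                              # learning rate

for _ in range(20):                   # a few passes over the data
    for xi, target in zip(X, y):
        pred = int(np.dot(w, xi) + b > 0)
        error = target - pred
        w += lr * error * xi          # classical perceptron update
        b += lr * error

print("Weights:", w, "Bias:", b)
print("Predictions:", [int(np.dot(w, xi) + b > 0) for xi in X])   # expected: [0, 0, 0, 1]
```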

2. DEEP LEARNING

Deep learning is part of a broader family of machine learning methods and it is based
on learning data representations, as opposed to task-specific algorithms. Learning can
be supervised, semi-supervised or unsupervised. Deep learning architectures have been applied
to fields including computer vision, speech recognition, natural language processing (NLP), audio recognition, medical image analysis, social network filtering, bioinformatics, drug design, and board game programs, where they have produced results comparable to, and in some cases superior to, human experts. Learning is an intellectual process of knowledge and behavior acquisition. The learning process can be classified into five categories: object identification, functional regression, cluster classification, behavior generation, and knowledge acquisition. A recent discovery in knowledge science by Wang showed that the basic unit of knowledge is a binary relation, analogous to the bit for information and data. A fundamental challenge to knowledge learning, different from those addressed by deep and recurrent neural network technologies, has led to the emergence of the field of cognitive machine learning on the basis of recent breakthroughs in mathematical engineering.

Deep learning models are loosely inspired by information processing and communication patterns in biological nervous systems, yet they have various differences from the structural and functional properties of biological brains, especially human brains, which make them incompatible with neuroscience evidence. Most modern deep learning models are based on an artificial neural network, although they can also include latent variables organized layer-wise in deep generative models, such as the nodes in deep belief networks. In deep learning, each level learns to transform its input data into a slightly more abstract and composite representation. In an image recognition application, the input may be a matrix of pixels; the first representational layer may abstract the pixels and encode edges, the second layer may compose and encode arrangements of edges, the third layer may encode a nose and eyes, and the fourth layer may recognize that the image contains a face. Essentially, a deep learning process can learn on its own which features to place optimally at which level. The "deep" in "deep learning" refers to the number of layers through which the data is transformed; more precisely, a deep learning system has a substantial Credit Assignment Path (CAP) depth. A minimal sketch of such a layered transformation is given below.
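To make the idea of successive layers re-representing their input concrete, the hedged sketch below runs a random, untrained two-hidden-layer network forward in NumPy; the layer sizes stand in for "edges", "arrangements of edges", and "face" detectors and are purely illustrative.

```python
# Minimal sketch of a deep forward pass: each layer re-represents its input.
# Weights are random and untrained; layer sizes are illustrative stand-ins for
# "edges" -> "arrangements of edges" -> "object parts" -> class score.
import numpy as np

rng = np.random.default_rng(0)
pixels = rng.random(64)                      # e.g. a flattened 8x8 image patch

def layer(x, n_out):
    W = rng.normal(scale=0.1, size=(n_out, x.size))
    return np.tanh(W @ x)                    # affine map followed by a nonlinearity

h1 = layer(pixels, 32)   # low-level features (e.g. edges)
h2 = layer(h1, 16)       # compositions of low-level features
h3 = layer(h2, 8)        # higher-level parts
score = layer(h3, 1)     # final decision, e.g. "contains a face?"

print([v.shape for v in (pixels, h1, h2, h3, score)])
# The credit assignment path (CAP) here has depth 4: four transformations from input to output.
```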

Deep learning helps to disentangle these abstractions and pick out which features improve performance. For supervised learning tasks, deep learning methods obviate feature engineering by translating the data into compact intermediate representations akin to principal components, and they derive layered structures that remove redundancy in the representation. Deep learning algorithms can also be applied to unsupervised learning tasks. This is an important benefit because unlabeled data are more abundant than labeled data. Examples of deep structures that can be trained in an unsupervised manner are neural history compressors and deep belief networks. Owing to the decreasing cost of computing power, the profusion of data, and better algorithms, AI has entered a new developmental stage and AI 2.0 is developing rapidly. Deep learning (DL), reinforcement learning (RL) and their combination, deep reinforcement learning (DRL), are representative and relatively mature methods in the family of AI 2.0.

2.1 Literature Survey on Learning Algorithms

The researcher (Mohamed, 2018) improved the performance of deep learning using a proposed algorithm called the Random Forest HTM Cortical Learning Algorithm (RFHTMC). The proposed algorithm is a combined version of Random Forest and the Hierarchical Temporal Memory (HTM) Cortical Learning Algorithm. The methodology for improving the performance of deep learning relies on minimizing the mean absolute percentage error, which is an indicator of high forecast performance, together with the overlap duty cycle, whose high percentage indicates fast processing by the classifier. The results show that the proposed algorithm reduces the mean absolute percentage error by half and increases the overlap duty cycle by 15%, while maintaining the values of the active duty cycle, the mixed integer representation of the input data units, and the durability of the system. The authors conclude that deep learning is an important field of machine learning within artificial intelligence, and that the usefulness of such systems depends on their behavior and performance.
Researchers (Xiaoxiao and Satinder, 2017) have presented a better real-time Atari game-playing agent than DQN. The combination of new reinforcement learning and deep learning approaches holds the promise of making important progress on challenging applications requiring both rich perception and policy selection. The Arcade Learning Environment (ALE) provides a set of Atari games that form a useful benchmark set of such applications. A breakthrough in combining model-free reinforcement learning with deep learning, called DQN, achieves the best real-time agents thus far. Planning-based approaches achieve far higher scores than the best model-free approaches, but they exploit information that is not available to human players and are orders of magnitude slower than needed for real-time play. The main goal of this work is to build a better real-time Atari game-playing agent than Deep Q-Learning (DQN); the proposed system introduces new agents based on this idea and shows that they outperform DQN.
Researchers (Woochul and Daeyeon, 2018) have surveyed important efforts to develop lightweight and highly efficient deep learning inference mechanisms for resource-constrained mobile and IoT devices. Some approaches propose hardware-based accelerators, and some propose to reduce the amount of computation of deep learning models using various model compression techniques. Even though these efforts have demonstrated important gains in performance and efficiency, they are not aware of the Quality-of-Service (QoS) requirements of various IoT applications and hence exhibit unpredictable 'best-effort' performance in terms of inference latency, power consumption, resource usage, and so on. In IoT devices with temporal constraints, such unpredictability can result in undesirable effects such as compromised safety. The authors propose a novel deep learning inference runtime called DeepRT. Unlike previous inference accelerators, DeepRT focuses on supporting predictable inference performance both temporally and spatially.
The authors (Dongxia et al., 2018) introduce the concept and status quo of deep learning, reinforcement learning and deep reinforcement learning, review their potential for application in smart grids, and provide an overview of the research work on their application in smart grids.
Smart grids are the evolving trend of power systems, and they have attracted much attention all over the world. Due to the complexity and uncertainty of the smart grid and the high volume of information being collected, artificial intelligence techniques represent some of the enabling technologies for its future development and success. Owing to the decreasing cost of computing power, the profusion of data, and better algorithms, AI has entered a new developmental stage and AI 2.0 is developing rapidly. Deep learning (DL), reinforcement learning (RL) and their combination, deep reinforcement learning (DRL), are illustrative and relatively established methods in the family of AI 2.0.
E. Balance the Loss: Improving Deep Hash via Loss Weighting and Semantic
Preserving
Learning to hash is widely used in Approximate Nearest-Neighbor (ANN) search. However, traditional hash learning methods, which split hashing into two parts, feature extraction and hash function learning, usually result in low retrieval accuracy. Although existing deep learning-based hashing methods can improve hashing quality by coupling feature learning and hash encoding, they are often affected by the positive-negative sample imbalance problem, which deteriorates the performance of the generated hash codes. The authors propose an end-to-end deep hashing framework in which a weighted pairwise loss function is employed to alleviate the sample imbalance problem: the loss generated by the positive pairs and the negative pairs is automatically given different weights. Moreover, a classification network is integrated into the hashing framework, which preserves semantic information by ensuring the generated hash codes are also optimal for classification. Comparison experiments are conducted on two benchmark datasets to demonstrate the performance of the proposed approach.

The authors (Xin et al., 2016) provided a forum for different research efforts towards a
new machine learning theory and algorithms with applications to autonomous systems.
Autonomous systems are a class of intelligent systems that can realize autonomous sensing, modeling, decision-making, and control in uncertain, dynamic environments. Typical examples
include autonomous vehicles and intelligent software agents in networks. In the past decade,
theory and applications of autonomous systems have been investigated from multidisciplinary
perspectives including machine learning, pattern recognition, robotics, and intelligent control. To
realize adaptive sensing, modeling, planning, and control for autonomous systems, various
machine learning methods play an important role in discovering knowledge from observed data.
In this context, due to the increasing complexity of real-world applications, it is necessary for autonomous systems to have improved learning abilities, such as online learning and learning from imbalanced data, for sensing, planning, and motion control. Efforts to address these difficulties include incremental learning from stream data, semi-supervised learning, online reinforcement learning (RL), and approximate dynamic programming (ADP).

3. DEEP LEARNING ALGORITHMS AND APPLICATIONS

3.1 Categorization of Deep Learning


This section provides a categorical review of deep learning architectures by learning mechanism and learning output task, and gives brief descriptions of the many algorithmic implementations of each. The main learning mechanisms are supervised learning, unsupervised learning, and reinforcement learning. In general, learning mechanisms are categorized by the type of input data that they operate upon. Output tasks include classification, regression, dimensionality reduction, clustering, and density estimation (Fadlullah, 2017).
A. Supervised Learning
In supervised learning, the data examined is required to be clearly labeled, and thus the output can be supervised, or classified, as correct or incorrect. In particular, supervised learning uses a predictive mechanism in which a portion of the data is learned upon (the training set), another portion is used to validate the trained model (cross-validation), and the remainder is used to determine the accuracy and effectiveness of prediction. Though accuracy is an important metric, other statistical measures, such as precision, recall, and F1 score, are used to assess the ability of a trained model to generalize to new data. Classification and regression are the two primary learning tasks in supervised learning.
B. Classification

In classification, the output of the learning task will be of a finite set of classes. The
classification process can be done in the form of binary classification of only two classes (0 or
1), multi-class classification resulting in one class out of a set of three or more total classes (red,
green, blue, etc.), as multi-label classification, where objects can belong to multiple binary
classes (red or not red, and car or not car), and even as all pairs classification, in which every
class in the finite set is directly associated with every other class in a binary method.
C. Regression
In contrast to classification, the output of regression learning is one or more continuous-
valued numbers. Regression analysis is a suitable mechanism to provide scored labels corresponding to multi-label classification, where each item of a set has a probability of belonging (e.g., 0.997 red, 0.320 green, 0.008 blue). Regression has been applied in various areas, including monocular camera-based localization in outdoor environments (Naseer & Burgard, 2017), among others.
3.2 Unsupervised Learning
In unsupervised learning, the datasets provided as input for machine learning are not labeled in any way that determines a correct or incorrect result (Sathiyamoorthi & Ramya, 2014). Instead, the result may achieve some broader anticipated goal, be judged on its ability to find structure that is easily human-discernible, or provide a complex application of a statistical function to extract an intended value. However, the ability to extract the compressed representation accurately may still need to be tested to determine the appropriateness of the implementation (Sathiyamoorthi, 2017).
A. Dimensionality Reduction
Dimensionality reduction can be carried out in various ways, including different forms of component and discriminant analysis. Examples of dimensionality reduction include the reduction of sequential data, such as video frames, to reduce noisy or redundant data while maintaining important features of the original data, and the use of deep belief networks to reduce the dimensionality of hyperspectral (400-2500 nm) images of landscapes to determine plant life content (Jati et al., 2016).
B. Clustering

Clustering algorithms are used to statistically group data. In general, this occurs through the alternating selection of cluster centroids and cluster membership. For example, k-means and fuzzy c-means clustering utilize the least mean square error of the distances between clusters and centroids (Senbagavalli & Tholkappia, 2014); fuzzy clustering then allows data membership in multiple clusters, making the edges of the clusters "fuzzy." Other clustering algorithms use the Gaussian Mixture Model (GMM) or other statistical and probabilistic mechanisms, instead of Euclidean distance, to make cluster selections (Yang et al., 2016).
C. Density Estimation
Density estimation, in general, is the statistical extraction or approximation of features of
data distribution, such as the extraction of densities of subgroups of data to evaluate correlations.
Density estimation Examples includes the estimation of power spectral density for noise
reduction in binaural assisted listening devices, and intersection vehicle traffic density estimation
utilizing CNN on heterogeneous distributed video (Honglak et al., 2017).
D. Reinforcement Learning
Reinforcement learning can be considered an intermediate between supervised and unsupervised learning because, though the data is not explicitly labeled, a reward is supplied upon the execution of an action. The learning architecture in reinforcement learning interacts with the environment directly, such that a change in the environment returns a specific reward. The goal of the reinforcement learning system is to increase the return of every state transition by learning the best actions to take at each given state. This is embodied by the perception-action-learning loop. The loop can run for an indefinite time, or it can be applied in episodes, learning to increase the outcome of each episode.
a) Policy Search
Policy search can be carried out by gradient-based (via back propagation) or gradient-free (evolutionary) methods to directly search for an optimal policy. These methods output the parameters of a probability distribution, for either continuous or discrete actions, resulting in a stochastic policy.
b) Value Function
Value function methods operate by estimating the expected return of being in a given state and attempting to select an optimal policy, one which chooses the action that maximizes the expected value over all actions for a given state. The policy can be improved by iterative evaluation and update of the value function estimate. The state-action value function, otherwise known as the quality function, is the basis of Q-learning; a minimal tabular sketch is given below.
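As a hedged illustration of the quality (state-action value) function, the sketch below runs tabular Q-learning on a tiny made-up chain environment in NumPy; the environment, the +1 reward at the goal state, and the hyperparameters are all invented for illustration.

```python
# Minimal tabular Q-learning sketch on a toy 4-state chain: move right to reach the goal.
# Environment, rewards, and hyperparameters are invented for illustration; NumPy assumed.
import numpy as np

n_states, n_actions = 4, 2          # actions: 0 = left, 1 = right; state 3 is the goal
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

for _ in range(200):                               # episodes
    s = 0
    while s != n_states - 1:
        a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next, r = step(s, a)
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(np.round(Q, 2))      # "right" should end up with the higher value in every state
```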
3.3 Deep Learning Algorithms and Frameworks
Different deep learning algorithms help improve learning performance, broaden the scope of applications, and simplify the calculation process. However, the extremely long training time of deep learning models remains a major problem for researchers. Additionally, classification accuracy can be drastically enhanced by increasing the size of the training data set and the number of sample parameters. In order to accelerate deep learning processing, several advanced techniques have been proposed in the literature. Deep learning frameworks combine the implementation of modularized deep learning algorithms, optimization techniques, distribution techniques, and support for infrastructures.
A. Unsupervised and Transfer Learning
In recent years, learning reusable features using unsupervised techniques has shown promising results in different applications. In the last few years, generative models such as GANs and VAEs have become dominant techniques for unsupervised deep learning. These networks are based on CNNs and have shown their strength in unsupervised visual data analysis. In another work, a deep sparse autoencoder is trained on a very large-scale image dataset to learn features. This network generates a high-level feature extractor from unlabeled data, which can be used for face detection in an unsupervised manner (Guyon et al., 2012; Silver, 2016). The generated features are also discriminative enough to detect other high-level objects like animal faces or human bodies. In practice, very few people have the luxury of accessing very high-speed GPUs and powerful hardware to train a very deep network from scratch in a reasonable time. Therefore, pre-training a deep network (e.g., a CNN) on large-scale datasets (e.g., ImageNet) is very common; a hedged sketch of this transfer-learning pattern is given below.
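The sketch below illustrates the common transfer-learning pattern described here, assuming PyTorch and torchvision are available: load a CNN pre-trained on ImageNet, freeze its feature-extraction layers, and retrain only a new classification head. The 10-class head, learning rate, and dummy batch are placeholders, not part of the original text.

```python
# Hedged transfer-learning sketch: reuse ImageNet features, retrain only the classifier head.
# Assumes PyTorch/torchvision are installed; the 10-class head and optimizer settings are placeholders.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)      # CNN pre-trained on ImageNet
                                              # (newer torchvision versions use the `weights=` argument)
for param in model.parameters():
    param.requires_grad = False               # freeze the learned feature extractor

model.fc = nn.Linear(model.fc.in_features, 10)  # new head for a hypothetical 10-class task

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch (real data loading is omitted).
images = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, 10, (4,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```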
B. Online Learning
Usually, the network topologies and architectures in deep learning are static in time; that is, they are predefined before the process starts and remain time-invariant. This restriction poses a serious challenge when the data is streamed online. Online learning entered mainstream research some time ago, but only modest advances have been observed in online deep learning. Conventionally, DNNs are built upon the Stochastic Gradient Descent (SGD) approach, in which the training samples are used individually, with their known labels, to update the model parameters.
C. Optimization Techniques in Deep Learning
Training a DNN is an optimization process, i.e., finding the parameters in the network
that minimize the loss function. In practice, the SGD method is a fundamental algorithm applied
to deep learning, which iteratively adjusts the parameters based on the gradient for each training
sample. The computational complexity of SGD is lower than that of the original gradient descent method, in which the whole dataset is considered every time the parameters are updated. In the learning process, the updating speed is controlled by the learning-rate hyperparameter. Lower learning rates eventually lead to an optimal state after a long time, while higher learning rates decay the loss faster but may cause fluctuations during training; a minimal sketch of the SGD update is given below.
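The hedged NumPy sketch below applies per-sample SGD to a tiny least-squares problem and shows the role of the learning rate; the data and the two rates are illustrative only.

```python
# Minimal sketch of stochastic gradient descent (SGD) for least-squares, per-sample updates.
# NumPy assumed available; the data and learning rates are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([1.5, -2.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

def sgd(lr, epochs=20):
    w = np.zeros(2)
    for _ in range(epochs):
        for xi, yi in zip(X, y):                 # one sample at a time
            grad = 2 * (xi @ w - yi) * xi        # gradient of the squared error for this sample
            w -= lr * grad                       # update speed controlled by the learning rate
    return w

print("small learning rate   :", sgd(lr=0.001))  # smaller steps, slower progress per update
print("moderate learning rate:", sgd(lr=0.05))   # faster progress; too large a rate would oscillate
```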
4. APPLICATIONS OF DEEP LEARNING
The primary developments have been in the application of deep learning toward
multimedia analysis, including image, audio, and natural language processing, which has
afforded significant leaps in the state of the art for autonomous systems. Indeed, machine
learning is primarily concerned with data fitting, the primary uses of which are optimization,
discrimination, and prediction. In addition, improvements in big data and cloud computing have
created the potential for machine learning to flourish, enabling the requisite data collection and
dissemination, as well as the computational capacity to execute deep models. The presence of
data, and the nature of its potential, have directly demanded more accurate, generalized, and efficient learning mechanisms. As shown in Figure 13, deep learning applications can be categorized into two groups: mature applications and emerging applications.

Figure 13: Applications of Deep Learning

Some of the Applications of Deep Learning are listed below:


A. Image and Video Recognition/Classification
Within deep learning research, image and video processing, recognition, and detection have seen explosive growth in recent years. From a classification point of view, the goal is to check how the overall dataset has been classified, for example whether each data element is positive, negative, or neutral.
B. Text Analysis and Natural Language Processing
Text and natural language processing offer the potential for on-the-fly language translation and for communication between humans and computer systems via natural speech. For the related task of sentiment classification through text analysis, there have been a number of research efforts.
C. Autonomous Systems and Robotics
Deep learning has been widely applied to enable diverse sensory input to assist in the workings of autonomous machines in manufacturing and commercial spaces, and in most cases these cannot be wholly separated from video recognition. In the area of robotic manipulation, deep learning has made significant strides toward the rapid training of robotic arms for monotonous manufacturing tasks in a variety of ways (Mohamed, 2018).
D. Medical Diagnostics
Due to advancements in image analysis, medical diagnostics have benefited significantly from the rapid improvements in deep learning. Considerable work has been done toward improving the detection of diseases and other defects from MRI images, CT scans, and so on. In addition, IoT devices for medical applications can provide autonomous monitoring of patients across medical populations. Medical diagnosis is the process of determining which disease or
condition explains a person's symptoms and signs. It is most often referred to as diagnosis with
the medical context being implicit. The information required for diagnosis is typically collected
from a history and physical examination of the person seeking medical care. Often, one or
more diagnostic procedures, such as diagnostic tests, are also done during the process.
Sometimes the posthumous diagnosis is considered a kind of medical diagnosis.

Diagnosis is often challenging, because many signs and symptoms are nonspecific. For
example, redness of the skin (erythema), by itself, is a sign of many disorders and thus does not
tell the healthcare professional what is wrong. Thus differential diagnosis, in which several
possible explanations are compared and contrasted, must be performed. This involves
the correlation of various pieces of information followed by the recognition and differentiation of
patterns. Occasionally the process is made easy by a sign or symptom (or a group of several) that
is pathognomonic. Diagnosis is a major component of the procedure of a doctor's visit. From the
point of view of statistics, the diagnostic procedure involves classification tests.

E. Cyber Security

Given the unparalleled utilization of network-connected devices, and the significant
dependence on information communication technology throughout the world, it is no surprise
that nefarious users attempt, and occasionally succeed, in compromising credentials, bypassing security systems, or simply attacking hosts with network traffic. As with existing research applying machine learning to conduct accurate cyber situation awareness, the use of deep learning technologies for cyber security analysis and intrusion detection is highly relevant, as the majority of attacks use families of intrusive software that can be observed and classified. Cyber security is the protection of Internet-connected systems, including hardware, software and data, from cyber-attacks.
In a computing context, security includes both cyber security and physical security, which are used by enterprises to protect against unauthorized access to data centers and other computerized systems. Information security, which is designed to maintain the confidentiality, integrity and availability of data, is a subset of cyber security.
Elements of cyber security

Ensuring cyber security requires the coordination of efforts throughout an information


system, which includes:

o Application security

o Information security

o Network security

o Disaster recovery/business continuity planning

o Operational security

o End-user education

CONCLUSION
Deep learning is a technology that continues to mature and has clearly been applied to a multitude of applications and domains to great effect. While the full-scale adoption of deep learning technologies in industry is ongoing, measured steps should be taken to ensure appropriate application of deep learning, as the subversion of deep learning models may result in significant loss of monetary value, trust, or, in extreme cases, even life. This survey provided an overview of deep learning algorithms and applications. It introduced many common and widely adopted deep learning frameworks and considered them from the perspectives of design, extensibility, and comparative efficacy. Additionally, it investigated the state of the art in deep learning research, including autonomous systems, multimedia processing, medical diagnostics, and algorithmic enhancement. This work provides a valuable reference for researchers and computer science practitioners alike in considering the algorithms and applications of deep learning, and aims to provoke interest in areas that urgently need further consideration.

REFERENCES
Dongxia Z., Xiaoqing H. and Chunyu D. (2018), Review on the Research and Practice of Deep
Learning and Reinforcement Learning in Smart Grids, CSEE Journal of Power and Energy
Systems, VOL. 4, NO. 3.

Fadlullah, Z. M. (2017), State-of-the-art deep learning: Evolving machine intelligence toward tomorrow's intelligent network traffic control systems, IEEE Communications Surveys & Tutorials, Vol. 19, No. 4, pp. 2432-2455.

Guyon I., Dror G., Lemaire. V, Taylor G. and Silver D. (2012), Unsupervised and Transfer
Learning Challenge: a Deep Learning Approach, Workshop and Conference Proceedings pp: 97–
111.

Honglak L., Richard L., Xiaoshi W. and Satinder S. (2017), Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning.

Jati D., Mantau A. J. and Wasito I.(2016),Dimensionality reduction using deep belief network in
big data case study: Hyperspectral image classification, in Proc. Int.Workshop Big Data Inf.
Secur. (IWBIS), pp. 71-76.

Mohamed A.A, (2018), Improving Deep Learning Performance Using Random Forest HTM
Cortical Learning Algorithm.

Naseer, T. and Burgard, W. (2017), Deep regression for monocular camera-based 6-DoF global localization in outdoor environments, in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), pp. 1525-1530.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2011a), Improving the Performance of Web Page
Retrieval through Pre-Fetching and Caching. European Journal of Scientific Research,Vol. 66,
No. 2, 207-217.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2011b), Data Pre-Processing Techniques for Pre-
Fetching and Caching of Web Data through Proxy Server. International journal of Computer
Science and Network security, Vol. 11, No. 11, 92-98.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2012), Optimizing the Web Cache performance
by Clustering Based Pre-Fetching Technique Using Modified ART1. International Journal of
Computer Application, 44(1), 51-60.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2013), Novel Approaches for Integrating MART1
Clustering Based Pre-Fetching Technique with Web Caching. International Journal of
Information Technology and Web Engineering, 8(2), 18-32.

Sathiyamoorthi, V. (2016), A Novel Cache Replacement Policy for Web Proxy Caching System
Using Web Usage Mining. International Journal of Information Technology and Web
Engineering. Volume 11, Issue 2, 1-12.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2012a), Improving the Performance of Web Page Retrieval through Pre-Fetching and Caching. European Journal of Scientific Research, Vol. 66, No. 2, pp. 207-217.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2012b), Data Pre-Processing Techniques for Pre-Fetching and Caching of Web Data through Proxy Server. International Journal of Computer Science and Network Security, Vol. 11, No. 11, pp. 92-98.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2012c), A Novel Approach for Web Caching through Modified Cache Replacement Algorithm. International Journal of Engineering Research and Industrial Applications, Vol. 5, No. 1, pp. 241-254.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2009a), Data Preparation Techniques for Mining World Wide Web through Web Usage Mining - An Approach. International Journal of Recent Trends in Engineering, Vol. 2, No. 4, pp. 1-4.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2012d), Optimizing the Web Cache Performance by Clustering Based Pre-Fetching Technique Using Modified ART1. International Journal of Computer Application.

Sathiyamoorthi, V. and Murali Bhaskaran, V. (2009b), Data Mining for Enterprise Resource Planning System. International Journal of Recent Trends in Engineering, Vol. 3, No. 4, pp. 1-4.

Sathiyamoorthi, V. and Ramya, P. (2014), Enhancing Proxy Based Web Caching System Using Clustering Based Pre-Fetching with Machine Learning Technique. International Journal of Research in Engineering and Technology (IJRET).

Sathiyamoorthi, V. (2017), Improving the Performance of an Information Retrieval System through Web Mining. DOI: 10.1515/itc-2017-0004, pp. 28-34.

Senbagavalli, M. and Tholkappia, A. G. (2014), A Competent Approach for Extracting and Visualizing Web Opinions Using Clustering. International Journal of Advanced Research in Computer Science and Software Engineering, Volume 4, Issue 6.

Silver D. (2016), ``Mastering the game of Go with deep neural networks and tree search,''
Nature, vol.529, pp. 484-489.

Woochul K. and Daeyeon K. (2018), DeepRT: A Predictable Deep Learning Inference Framework for IoT Devices.

Xin X., Haibo H., Dongbin Z., Shiliang S., Lucian B. and Simon X. Y. (2016), Machine Learning with Applications to Autonomous Systems, Hindawi Publishing Corporation.

Yang X., Zhao P., Zhang X., Lin J. and Yu W. (2017), Toward a Gaussian mixture model-based detection scheme against data integrity attacks in the smart grid, IEEE Internet of Things Journal, Vol. 4, No. 1, pp. 147-161.
