Professional Documents
Culture Documents
Rose, Research
and Thomas P. Karnowski
The University of Tennessee, USA
Frontier
M
I. Introduction ognition systems has shifted to the time [5] [6]. To that end, modeling the
imicking the efficiency and human-engineered feature extraction temporal component of the observations
robustness by which the human process, which at times can be challeng- plays a critical role in effective informa-
brain represents information ing and highly application-dependent tion representation. Capturing spa-
has been a core challenge in n arti- [2]. Mo
Moreover, if incomplete or tiotemporal dependencies, based on
ficial intelligence research for erro
erroneous features are ex - regularities in the observations, is there-
decades. Humans are ex - tra
tracted, the classification fore viewed as a fundamental goal for
posed to myriad of senso- p
process is inherently lim- deep learning systems.
ry data received every ited in performance. Assuming robust deep learning is
second of the day and are Recent neuroscience achieved, it would be possible to train
somehow able to capture findings have provided such a hierarchical network on a large set
critical aspects of this insight into the princi- of observations and later extract signals
data in a way that allows p governing informa-
ples from this network to a relatively simple
for its future use in a con- ti representation in the
tion classification engine for the purpose of
cise manner. Over 50 years rs mam
mammalian brain, leading to robust pattern recognition. Robustness
ago, Richard Bellman, who new ideas for designing sys- here refers to the ability to exhibit classi-
BRAND X
introduced dynamic programming i PICTURES h represent information.
tems that fication invariance to a diverse range of
theory and pioneered the field of opti- One of the key findings has been that transformations and distortions, including
mal control, asserted that high dimen- the neocortex, which is associated with noise, scale, rotation, various lighting
sionality of data is a fundamental hurdle many cognitive abilities, does not explic- conditions, displacement, etc.
in many science and engineering appli- itly pre-process sensory signals, but rath- This article provides an overview of
cations. The main difficulty that arises, er allows them to propagate through a the mainstream deep learning approach-
particularly in the context of pattern complex hierarchy [3] of modules that, es and research directions proposed over
classification applications, is that the over time, learn to represent observa- the past decade. It is important to
learning complexity grows exponen- tions based on the regularities they emphasize that each approach has
tially with linear increase in the dimen- exhibit [4]. This discovery motivated the strengths and weaknesses, depending on
sionality of the data. He coined this emergence of the subfield of deep the application and context in which it
phenomenon the curse of dimensional- machine learning, which focuses on is being used. Thus, this article presents a
ity [1]. The mainstream approach of computational models for information summary on the current state of the
overcoming the curse has been to representation that exhibit similar char- deep machine learning field and some
pre-process the data in a manner that acteristics to that of the neocortex. perspective into how it may evolve.
would reduce its dimensionality to that In addition to the spatial dimension- Convolutional Neural Networks
which can be effectively processed, for ality of real-life data, the temporal com- (CNNs) and Deep Belief Networks
example by a classification engine. This ponent also plays a key role. An observed (DBNs) (and their respective variations)
dimensionality reduction scheme is sequence of patterns often conveys a are focused on primarily because they
often referred to as feature extraction. meaning to the observer, whereby inde- are well established in the deep learning
As a result, it can be argued that the pendent fragments of this sequence field and show great promise for future
intelligence behind many pattern rec- would be hard to decipher in isolation. work. Section II introduces CNNs and
Meaning is often inferred from events or subsequently follows by details of DBNs
Digital Object Identifier 10.1109/MCI.2010.938364 observations that are received closely in in Section III. For an excellent further
This solution is not constrained to a lay- speech recognition and detection [12], ples, then projecting these similarity mea-
er-by-layer training procedure, making general object recognition [9], natural surements to lower-dimensional spaces.
it highly attractive for implementation language processing [24], and robotics. In addition, further inspiration and tech-
on parallel processing platforms. Nodes The reality of data proliferation and niques may be found from evolutionary
independently characterize patterns abundance of multimodal sensory infor- programming approaches [59, 60] where
through the use of a belief state con- mation is admittedly a challenge and a conceptually adaptive learning and core
struct, which is incrementally updated as recurring theme in many military as well architectural changes can be learned with
the hierarchy is presented with data. as civilian applications, such as sophisti- minimal engineering efforts.
This rule is comprised of two con- cated surveillance systems. Consequently, Some of the core questions that
structs: one representing how likely sys- interest in deep machine learning has necessitate immediate attention include:
tem states are for segments of the not been limited to academic research. how well does a particular scheme scale
observation, P(observation|state), and Recently, the Defense Advanced with respect to the dimensionality of the
another representing how likely state to Research Projects Agency (DARPA) has input (which in images can be in the
state transitions are given feedback from announced a research program exclu- millions)? What is an efficient frame-
above, P(subsequent state|state,feedback). sively focused on deep learning [29]. work for capturing both short and long-
The first construct is unsupervised and Several private organizations, including term temporal dependencies? How can
driven purely by observations, while the Numenta [30] and Binatix [31], have multimodal sensory information be most
second, modulating the first, embeds the focused their attention on commercial- naturally fused within a given architec-
dynamics in the pattern observations. izing deep learning technologies with tural framework? What are the correct
Incremental clustering is carefully applications to broad domains. attention mechanisms that can be used
applied to estimate the observation dis- to augment a given deep learning
tribution, while state transitions are esti- VI. The Road Ahead technology so as to improve robustness
mated based on frequency. It is argued Deep machine learning is an active area and invariance to distorted or missing
that the value of the scheme lies in its of research. There remains a great deal of data? How well do the various solutions
simplicity and repetitive structure, facili- work to be done in improving the learn- map to parallel processing platforms that
tating multi-modal representations and ing process, where current focus is on facilitate processing speedup?
straightforward training. lending fertile ideas from other areas of While deep learning has been suc-
Table 1 provides a brief comparison machine learning, specifically in the con- cessfully applied to challenging pattern
summary of the mainstream deep text of dimensionality reduction. One inference tasks, the goal of the field is far
machine learning approaches described example includes recent work on sparse beyond task-specific applications. This
in this paper. coding [57] where the inherent high scope may make the comparison of vari-
dimensionality of data is reduced through ous methodologies increasingly complex
V. Deep Learning Applications the use of compressed sensing theory, and will likely necessitate a collaborative
There have been several studies demon- allowing accurate representation of signals effort by the research community to
strating the effectiveness of deep learning with very small numbers of basis vectors. address. It should also be noted that,
methods in a variety of application Another example is semi-supervised despite the great prospect offered by
domains. In addition to the MNIST manifold learning [58] where the dimen- deep learning technologies, some
handwriting challenge [27], there are sionality of data is reduced by measuring domain-specific tasks may not be directly
applications in face detection [10] [51], the similarity between training data sam- improved by such schemes. An example