
Neural Networks, Vol. 4, pp. 15-25, 1991. 0893-6080/91 $3.00 + .00
Printed in the USA. All rights reserved. Copyright 1991 Pergamon Press plc.

ORIGINAL CONTRIBUTION

The Cortical Column: A New Processing Unit for Multilayered Networks

FRÉDÉRIC ALEXANDRE, FRÉDÉRIC GUYOT, AND JEAN-PAUL HATON


CRIN/INRIA

YVES BURNOD
Institut des Neurosciences, Paris V

(Received 22 August 1989; revised and accepted 3 May 1990)

Abstract—We propose in this paper a new connectionist unit that matches a biological model of the cortical column. The architectural and functional characteristics of this unit have been designed in the simplest manner in order to simulate human-like reasoning, and to be as similar as possible to the main known features of real intracortical networks. We use a new type of learning rule which can easily take into account goal-oriented combinations of actions in behavioral programs. These learning rules are both simple and biologically plausible. We show in this paper that such units can be used in multilayered networks to perform pattern recognition, with feedback connections effecting an attentive gating of sensory information flow. Computer simulations were performed to assess the ability of a multilayered network made of these biologically inspired units to perform standard speech and visual recognition. Such simulations show levels of performance equivalent to the best currently available connectionist networks for typical human-like problems, with very fast learning and recognition processes. Furthermore, this type of "cortical" unit can be used in more general multilayered networks with units controlling different types of external processing, in order to learn programs of actions which may be included in the process of recognition.

Requests for reprints should be sent to Yves Burnod, Department of Neurosciences de la Vision, Institut des Neurosciences, Bât. C, 6ème étage, 9, Quai St. Bernard, Paris 5ème, France.

1. INTRODUCTION

During the past ten years, both artificial intelligence researchers and neuroscientists have been paying increasing attention to connectionist networks (Rumelhart & McClelland, 1986). As a general rule, such networks are built with neuron-like interconnected units, with connective efficiencies that change according to locally available signals. The whole network learns to produce adaptive functions such as recognition, with some similarities to certain brain properties such as associative memories (Rumelhart & McClelland, 1986).

Connectionist models based upon mathematical formalisms share common features with real neural networks. For instance, both have associative properties and content-addressable memories (Minsky & Papert, 1969; Kohonen, 1984), adaptive matching between external inputs, and internal "attractors" in recursive networks (Hopfield, 1982). Multilayered networks could correspond to the successive cortical maps which process sensory information, with learning rules which enable them to learn any type of input-output transformation using an error signal (back-propagation; Fogelman Soulie et al., 1989), or selective attention with top-down feedback control (neocognitron, Fukushima, 1980; ART, Grossberg, 1988).

Human processing abilities are mainly due to the cerebral cortex. Direct modeling of the neuronal circuit in this complex structure is difficult: at the cellular level, the cortical tissue includes different neuronal types, each with a defined pattern of connectivity, and different input-output functions due to a variety of transmitters and ionic channels. Consequently, models that attempt to simulate the cortex at the single-neuron level tend to become too complex to be implemented and mathematically controlled.

The basic idea presented in this paper is to progressively shape a processing unit formed by a small group of neurons corresponding to a morphofunctional substrate; in the cortex, this multicellular unit has been defined as the cortical column (Mountcastle, 1978; Hubel & Wiesel, 1977; Szentagothai, 1975), which is a repetitive circuit perpendicular to the cortical surface.

We thus describe a new connectionist approach, with a processing unit which does not correspond to a neuron but to a cortical column (Burnod, 1988). The main problem is to determine which mathematical operations could model in a simple way the basic functions of such a multicellular unit (Shaw, 1988). Two sources of inspiration are relevant.

The first one is neurobiological; the model has to be biologically plausible: its architectural and functional characteristics have been determined in the light of neurobiological considerations (see justifications in section 2). Even if the cortex is far from being understood, it is possible to use general principles to define the main types of inputs, outputs, and interactions.

The second one is cognitive; the processing unit combines essential features for human-like data processing, but with parallel and distributed representations.

Consequently, the proposed unit will be interpreted both in terms of neuronal activity and in terms of data processing. In this paper, we attempt to model this basic cortical function via a minimal set of rules, which nevertheless perform more elaborate functions than individual neurons. A formal network is developed, inspired from biological findings on the cerebral cortex (Burnod, 1988); this unit can be used to build large multilayered networks for general artificial intelligence (AI) tasks such as pattern recognition (Alexandre et al., 1989) or reasoning.

2. BIOLOGICAL INSPIRATION

The processing unit is designed to be in direct correspondence with the "cortical column," a stereotyped interneuronal circuit (about 100 μm wide) whose main characteristics are repeated throughout the cortex (Mountcastle, 1978; Szentagothai, 1975). These clusters of interconnected cells are not defined by anatomical boundaries, but rather from their homogeneous activities: when the firing frequency of cortical neurons is recorded in response to specific stimuli (Mountcastle, 1978; Hubel & Wiesel, 1977), or during stereotyped behaviour (Evarts & Tanji, 1974), homogeneous activities reveal sets of cells arranged in vertical clusters or columns. It is thus possible to delimit groups of highly interconnected cells sharing the same set of inputs and outputs (Szentagothai, 1975; Jones, 1981). The lateral width of this basic unit is determined by the functional homogeneity of component pyramidal neurons, in general due to the intersection between subsets of connective stripes (Hubel & Wiesel, 1977).

Behavioral tasks, such as pattern recognition, can be conceived as involving several sets of laterally interconnected columns working in parallel (like cortical maps), which effect serial processing on information originating from the environment.

2.1. Architecture

The cortex (and thus, the column) has a six-layered organization. We, like others (Ballard, 1986), have chosen to group these six layers into three major input-output divisions. The model unit thus has three divisions in direct correspondence with the three major input-output divisions of real cortical columns (Mountcastle, 1978; Jones, 1981):

1. The pyramidal neurons of the upper layers (layers 2 and 3) are specialized in cortico-cortical connections that form direct or indirect connections between any two columns (Szentagothai, 1975; Jones, 1981); the upper division of the unit will provide a similar set of connections. These connections will be named internal input and internal output (see, for example, Figure 1).

2. The intermediate layer (layer 4) receives the main sensory inputs, from two different sources: either directly from the thalamus, which provides information from the "external world"; or from other cortical areas involved in earlier stages of sensory processing: connectivity upon this division produces a progressive integration of sensory information, by a divergence-convergence resulting in a progressive increase in the size of the receptive fields (Van Essen & Maunsell, 1983; Zeki & Shipp, 1988). The intermediate division of the model will also have an external input which includes feed-forward connections.

3. The pyramidal neurons of the lower layers (layers 5 and 6) are more specialized to effect "actions" toward the external world or controls upon the information it produces (Zeki & Shipp, 1988; Van Essen & Maunsell, 1983): they project to subcortical structures and command different levels of actions and behavioural adaptations, and they participate in the feedback projection to previous cortical areas and can thus effect selective control of this sensory information flow. The lower division of the unit will also provide an external output which also includes feedback connections.
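To make these three divisions concrete, the sketch below (ours, not code from the paper; the simulations of Section 4 were written independently and ran on a Sun 3/50) records one unit through the input-output roles just listed, together with the three activity levels E0, E1, E2 introduced in Section 2.2 below.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict


class Level(IntEnum):
    """The three activity states used throughout the model (Section 2.2)."""
    E0 = 0   # null / inhibited
    E1 = 1   # moderate, mostly intracortical activity
    E2 = 2   # strong activity, able to drive outputs


@dataclass
class ColumnUnit:
    # Upper division (layers 2-3): cortico-cortical connections,
    # one internal input per connected unit, and one internal output.
    internal_inputs: Dict[int, Level] = field(default_factory=dict)
    internal_output: Level = Level.E0
    # Intermediate division (layer 4): thalamic or feed-forward external input.
    external_input: Level = Level.E0
    # Lower division (layers 5-6): actions toward the external world and
    # feedback control of the sensory flow.
    external_output: Level = Level.E0
    # Previous state of the unit, used by the activation rules of Figure 1.
    prior_state: Level = Level.E0
```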
Within each area, a column is connected from both upper and lower layers to neighboring columns of the same "primary indice" (as defined in Ballard, 1986; e.g., in area 17, the same retinotopic position), and to more distant columns sharing similar functional properties, with the same "secondary indice" (e.g., in area 17, the same orientation; Ballard, 1986; Gilbert & Wiesel, 1981).

The three divisions of our unit will thus respect major outlines of the cortical connectivity between columns. Its connections can be used to produce a multilayered network (like a multilayered perceptron) but with feedback pathways (like recursive networks; e.g., Hopfield nets). As columns, the units are not fully interconnected; a column is connected to a limited and rather constant number of cortical columns (Mountcastle, 1978; Szentagothai, 1975).

The general organization of the columns in the cortex displays cytoarchitectonical variations, such as the relative thicknesses of the six layers. These variations are introduced in the unit, which has similar parameters; for example, increasing the width of layer IV (thalamic input) in receptive areas is taken into account by changing the weights of the external input of the unit (see Figure 1).

2.2. Levels of Activity

In the cortex it is necessary to consider different states of activity to describe with minimal complexity several important features, such as selective attention, anticipation, or output actions which result in reentrant feedback inputs. Considering all-or-none activity, as in Hopfield nets (Hopfield, 1982), is insufficient in the case of the cortex. In our unit, we will distinguish three states of activity in correspondence with three cortical functional states (see Figure 1):

1. The low level, that we will call E1 in the unit, represents moderate neuronal activities (action potential frequencies in the range of 10 spikes/sec) which are insufficient to result in output actions; however, such activity, mostly intracortical, can reflect active processes during selective attention, as seen for example in parietal areas when stimuli do not match ongoing behaviour (Mountcastle, Andersen, & Motter, 1981), and also anticipation, as seen for example in frontal areas during waiting for a go signal (Fuster, 1977).

2. A higher level of activation, which we term E2 (action potential frequencies with a greater magnitude, 50-100 Hz), models activities observed during sensorimotor interactions with the external world, for stimuli matching intrinsic cortical filtering functions (in receptive areas; Hubel & Wiesel, 1977), or linked with ongoing behavior (in associative areas; Mountcastle, Andersen, & Motter, 1981), or resulting in command of a movement (in motor areas; Evarts & Tanji, 1974).

3. Finally, we include a null activity state (E0), which describes the effects of inhibitory processes.

2.3. Activation Rules

Activation rules of the unit will model input-output transformations performed by a column using these three activity levels; they are summarized in Figure 1, which details activities of the two output divisions (internal and external, for respectively upper and lower layers) depending upon activity levels of the two main inputs (internal and external, respectively to upper and intermediate layers) and the previous state of the column. Such activation rules will match the following three important features of cortical physiology.

FIGURE 1. Activation rules for a simple model of the cortical column. An in-out table imposes the state of activation of the two outputs (OI_i, OE_i, in the right part) when the activities of the internal and external inputs (II_i, IE_i), as well as the prior state of the unit, are known (left part of the table). Three activity levels are considered: E0, inhibition (or FALSE); E1, low level (or PERHAPS); E2, high level (or TRUE). The six lines of the table display the different possible combinations, and are ordered by the intensity of the internal inputs (see text for detailed description). When more than one internal input is active, each potential level of activation is evaluated; the most numerous is selected to be the real level of activation. In case of equality, the unit is set to the uncertain state E1. After learning, this case tends to disappear. The six lines (internal input, external input, prior state; then internal output, external output) are:
1. E0, E0, prior E0/E1: E0, E0 (inactivity);
2. E0, E2, prior E0/E1: E0, E0 in motor maps, E1, E0 in associative maps, E2, E2 in sensory maps (depends upon the map);
3. E1, E0, prior E0/E1: E0, E0 (inhibition), E1, E0 (gating), or E2, E2 (triggering), depending upon learning;
4. E1, E2, prior E0/E1: E2, E2 (amplification);
5. E2, E0 or E2, prior E0/E1: E0, E0 or E1, E0 (inhibition);
6. E2, E0 or E2, prior E2: E2, E2 (reactivation).
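The table of Figure 1 can be read as a small decision function. The sketch below is our own rendering of that table, not the authors' implementation: the map_kind and learned arguments stand for the "depends upon" column (the kind of map for line 2, the learning stage of Section 2.4 for line 3), and returning (E0, E0) on line 5 is only one of the two output patterns the table allows.

```python
from enum import IntEnum


class Level(IntEnum):
    E0 = 0
    E1 = 1
    E2 = 2


def activation(ii, ie, prior, map_kind="associative", learned="gating"):
    """Return (internal output, external output) for one internal input,
    following the six lines of Figure 1 (our reading of the table)."""
    E0, E1, E2 = Level.E0, Level.E1, Level.E2
    if ie == E1:
        raise ValueError("the global external input is E0 or E2 (see Section 3.4)")
    if ii == E0 and ie == E0:                              # line 1: inactivity
        return E0, E0
    if ii == E0 and ie == E2:                              # line 2: depends upon the map
        return {"motor": (E0, E0), "associative": (E1, E0),
                "sensory": (E2, E2)}[map_kind]
    if ii == E1 and ie == E0:                              # line 3: depends upon learning
        return {"inhibition": (E0, E0), "gating": (E1, E0),
                "triggering": (E2, E2)}[learned]
    if ii == E1 and ie == E2:                              # line 4: amplification
        return E2, E2
    if ii == E2 and prior != E2:                           # line 5: lateral inhibition
        return E0, E0                                      # (the table also allows E1, E0)
    return E2, E2                                          # line 6: reactivation
```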
Conditional inhibitions. Strong lateral inhibitions are observed in the cortex when produced by focal, well-patterned stimuli, although there are also direct corticocortical excitatory connections (Mountcastle, 1978; Hubel & Wiesel, 1977). In a cortical column, inhibitory and excitatory interneurons are branched in parallel on the direct connections between cortical and thalamic inputs onto pyramidal neurons (Szentagothai, 1975; Jones, 1981); the relation between columns is controlled in parallel by excitatory and inhibitory interneurons, which tend to produce nonlinear effects, with a spatiotemporal selectivity depending upon branching patterns and channels of interneurons. The combination performed by the unit (line 5 of Figure 1) will be qualitatively different for different levels of activity; an increasing level of activity can result initially in excitation (for a low level) followed by inhibition (for a high level). Consequently, a strong activity level will inhibit related units with a lower activity.

Gating and amplification. Moderate cortical inputs have a gating effect on other inputs, for example, on thalamic inputs (Connors, Gutnick, & Prince, 1982; Asanuma, Waters, & Yumina, 1982). Consequently, in the unit (line 3 of Figure 1), an internal input alone will have a weak influence (at level E1), but the coactivation of two inputs will generate a strong nonlinear increase of activity (E2).

Selective filtering of inputs. As described in the cortex, the operation upon the external and feed-forward inputs will be modelled by a spatial filtering process, with a central positive region and a peripheral negative one. This type of filtering is similar to the transfer function assumed to produce orientation selectivity in visual areas from thalamic inputs (Hubel & Wiesel, 1977).
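As an illustration of this selective filtering (ours, not from the paper), the sketch below applies an on-centre, off-surround mask to a small patch of external input; the 3 × 3 patch size is borrowed from the receptive fields used in the simulations of Section 4, and the weights are arbitrary assumptions.

```python
import numpy as np


def filtered_external_input(patch, center=1.0, surround=-0.125):
    """Spatial filtering with a central positive region and a peripheral
    negative one, applied to a 3 x 3 patch of external input."""
    patch = np.asarray(patch, dtype=float).reshape(3, 3)
    mask = np.full((3, 3), surround)   # negative periphery
    mask[1, 1] = center                # positive centre
    return float(np.sum(mask * patch))


# A focal, well-patterned stimulus excites; a diffuse one is cancelled out.
print(filtered_external_input([[0, 0, 0], [0, 1, 0], [0, 0, 0]]))  # 1.0
print(filtered_external_input([[1, 1, 1], [1, 1, 1], [1, 1, 1]]))  # 0.0
```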
Combining these rules results in a prediction: cortical activity will spread in the cortico-cortical network (via the upper layers) for a moderate activity, with no output outside the cortex. In contrast, a strong local pattern of activation will have a strong inhibitory effect on less-activated columns and will produce a precise combination of output actions (via lower-pyramidal neurons). Such a model gives an interpretation of neuronal activities within the cortex and their behavioral correlates: a higher level of well-patterned activity corresponds to reaching a goal, whereas lower activity levels correspond to a searching process (propagation of activity) or an attentive state.

2.4. Learning Rules

Local learning rules, such as the Hebb rule or the delta rule, enable neural nets to learn global in-out functions; mathematical generalization of these rules for multilayered networks was provided with back-propagation algorithms. Such algorithms have been used to model the transformations performed by cortical areas (Zipser & Andersen, 1988). But generalization of learning rules such as the back-propagation algorithm is based upon mathematical properties and has no simple physiological correlates at the neuronal level.

It is possible, however, to define learning rules which match the following features of the cortical network, both at the global and local levels.

• The global logic of the learning rules corresponds to operant conditioning: if a strongly active column participates in an action outside the cortex, and if this external action reactivates the inputs of this column (equivalent to a "reward," or more generally, to a "goal"), this input will gain by learning a new influence: when it has a low activation (corresponding to a "drive"), it selectively activates columns whose actions were previously efficient in satisfying the drive state.

• Local learning rules correspond mostly to activity-dependent plasticity of cortico-cortical pathways: these pathways are mediated by glutamate and involve receptors with potentiating properties (Barrionuevo & Brown, 1983). This plasticity occurs when a strong depolarization of the cell (upper-pyramidal neurons) is followed by a strong reactivation of its inputs (due to glutamate release). Long-term changes in the efficacy of this pathway will increase the influence of moderate cortico-cortical inputs (gating properties), but will not influence the other combinations, particularly with higher inputs.

Consequently, we will differentiate four stages of learning that can be easily interpreted as activity-dependent changes in the different types of cortical interneurons. The prediction made by considering this unit as a model of the cortical column is that different types of interneurons can modify their transmission efficiencies for specific activation patterns which depend upon the specific input-output connectivities of each type of cell.

The learning rules of the unit (Figure 2) summarize the overall long-term activity-dependent changes of neural transmissions within and between columns, mainly by their global effects on the input-output roles. We do not take into account synaptic weights from neuron to neuron, but transmission coefficients between input and output divisions of the units. Figure 2 (in the right-hand side) shows how a moderate internal input (from a unit A, "presynaptic") can produce four different effects on the target unit (B, which is the learning unit), depending upon previous patterns of activities. Critical features for learning (left part of Figure 2) are time coupling and states of activity, measured by two "repetition factors" called P2 and P0: P2 represents the probability that the learning unit B is active (in state E2) before a strong input (state E2) from unit A; conversely, P0 measures the inactive state of the learning unit B before a similar strong input from A. Consequences of learning (in the right-hand side of Figure 2) are only visible upon moderate inputs (E1).

FIGURE 2. Learning rules. This table shows the functional consequences (right part of the table) of the long-term changes of transmission produced by two patterns of activities (repetition factors, in the left part of the table): P0_i^j is the conditional probability that the unit i receives a strong internal input j when it was previously inhibited; P2_i^j is the conditional probability that a strong internal activation (E2) occurs when the unit i is already strongly active (E2). The four lines describe the four learning stages and the four possible resulting states of connections (detailed in the text). For an internal input at E1 and an external input at E0, the outputs of unit i (internal, external) are:
1. before learning: E0 or E1, E0 (random);
2. P0 = 1, P2 = 0: E0, E0 (inhibition);
3. P0 < 1, P2 > 0: E1, E0 (gating);
4. P0 = 0, P2 = 1: E2, E2 (triggering).

The four learning cases can be interpreted as follows:

1. Random: Before learning, a moderate input has weak and random effects.

2. Inhibition: If the unit is always inactive before a strong activation of a cortical input (P0 = 1), this input will gain by learning an inhibitory anticipatory effect. The behavioral interpretation of this rule is an increased inhibitory influence of possible goals toward actions that happened to be always inhibited before a success. The prediction for cortical physiology is that inhibitory interneurons can be modified like pyramidal neurons: when they are first strongly active (no action is made) and then when they are reactivated by cortico-cortical inputs (success).

3. Gating: When the unit is sometimes strongly active (and sometimes inactive) before a strong cortical input, there will be an increase in the gating effect of this input, when moderate; however, another coactive input will still be needed to produce a strong output. The behavioral interpretation is an increase of the gating influence of possible goals toward units which sometimes were efficient in reaching such goals. The prediction is an increased efficacy of the excitatory neurons, mainly upper-pyramidal neurons, as described in the preceding paragraph; such neurons are branched in parallel upon the direct pathway between cortico-cortical inputs and lower-layer output cells. Stabilization of gating could come from the competitive changes in excitatory and inhibitory vertically oriented interneurons which can control the coupling between upper and lower layers of the column.

4. Triggering: If the unit is always strongly active before a strong input (P2 = 1), this input, when moderate, will gain a very strong triggering effect that will result in a strong output. The behavioral interpretation is that possible goals can directly trigger actions which were always effective. This learning stage corresponds to the connective logic of vertically oriented disinhibitory interneurons.

The learning logic of this model is somewhat different from other connectionist models and gives a possible interpretation of the activity in the cerebral cortex.

Within this framework, every strong cortical activation of a cortical column can be viewed as a logical equilibrium state (a goal), due to lateral inhibition; for example, recognition corresponds to well-patterned activity (E2 or E0) in a cortical region. Conversely, a moderate level of activity which persists represents possible goals and thus an attentive state before recognition (a specific subset of competitive stable states are possible); in this case, activation rules result in a continuous search (or waiting) for cortical actions, adapted to the environment (match between cortical and thalamic inputs), which can result in attaining an equilibrium state, defining for example a full recognition (E2-E0). Furthermore, the learning rules enable intracortical connections to learn context-dependent combinations of cortical actions which are efficient in reaching this equilibrium.

At the local level, activity-dependent changes occur for in-out combinations which do not have the same temporal sequence as Hebb rules: the cell is first depolarized and then reactivated by its inputs. Such temporal sequences are more compatible with the known physiology of glutamate receptors. This type of learning is not competitive with other mechanisms closer to Hebb rules which could provide a more sensitive adjustment between inputs and outputs.
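Read as code, Figure 2 maps the two repetition factors onto the four connection states. The function below is our reading of that table, not the authors' procedure; the "before learning" test and the handling of boundary values are assumptions, since the table leaves them implicit.

```python
def learning_stage(p0, p2, n_observations=0):
    """Return the state of a cortico-cortical connection from the repetition
    factors P0 and P2 of Figure 2 (our interpretation of the table)."""
    if n_observations == 0:
        return "random"        # before learning: weak, random effects
    if p0 == 1.0 and p2 == 0.0:
        return "inhibition"    # the unit was always inhibited before this input
    if p0 == 0.0 and p2 == 1.0:
        return "triggering"    # the unit was always strongly active before it
    return "gating"            # intermediate case of the table: P0 < 1, P2 > 0
```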
3. INFORMATION-PROCESSING MODELIZATION

The processing unit that we propose has been interpreted in terms of neuronal activity; we now give a complementary interpretation in terms of data processing.

We propose a realization with six main features, sufficient to specify a connectionist network (Rumelhart & McClelland, 1986). A description and a data-processing interpretation are given for each characteristic.

3.1. Architecture

In human-like functions, at least three types of information can be distinguished: stimuli, actions, and concepts. They are separately expressed in the network:

• stimuli come from the external world via specific receptors;
• external outputs toward the external world will trigger specific actions;
• concepts are represented by distributed activities in the network and interact by internal input-output relations.

Inputs and outputs of a processing unit i are thus divided into an internal component (for concepts) and an external one (for stimuli and actions), as shown in Figure 1; they are denoted by II_i^j(t), IE_i^k(t), OI_i(t), OE_i(t), where j and k represent the connected units and t stands for the time.

This distinction corresponds to the upper layers of the cortex for internal relations, to the lower layers for external outputs, and to the intermediate one for external inputs.

3.2. Connectivity

The network is not fully interconnected; a unit is connected to a limited and rather constant number of other units in four linking ways that take into account the two types of input-output, as illustrated in Figure 3:

• a unit has n1 internal input-output connections with neighboring units in the same map;
• n2 internal input-output connections with other maps;
• each unit has an external output which is either a feedback or a response outside the network;
• the external input is either a stimulus or a feed-forward input from other maps. These external connections (both input and output) are organized in continuous overlapping receptive fields.

FIGURE 3 (diagram). Organization and connectivity of the processing levels (modeling cortical areas) used in the simulation network for visual recognition. The legend distinguishes internal input/output intra-areal connections, internal input/output inter-areal connections, the external input (input mask), and the external output; arrows show feed-forward and feedback internal connections between the sensory and associative areas.
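A minimal sketch of this connectivity is given below (ours; n1, n2, the neighbourhood radius and the map dimensions are free parameters rather than values fixed by the paper). It only builds the link structure, not the activation dynamics.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


@dataclass
class UnitLinks:
    intra_map: List[Coord] = field(default_factory=list)        # n1 internal links, same map
    inter_map: List[Coord] = field(default_factory=list)        # n2 internal links, other maps
    receptive_field: List[Coord] = field(default_factory=list)  # external input (input mask)
    external_output: str = "feedback"                           # feedback or response


def neighbourhood(x, y, width, height, radius=1):
    """Overlapping neighbourhood of a unit on a width x height map."""
    return [(x + dx, y + dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if (dx, dy) != (0, 0) and 0 <= x + dx < width and 0 <= y + dy < height]


def build_map_links(width, height, radius=1):
    """Give every unit its intra-map links and a small receptive field centred
    on the corresponding coordinates of the previous map."""
    links: Dict[Coord, UnitLinks] = {}
    for y in range(height):
        for x in range(width):
            links[(x, y)] = UnitLinks(
                intra_map=neighbourhood(x, y, width, height, radius),
                receptive_field=[(x, y)] + neighbourhood(x, y, width, height, radius),
            )
    return links
```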
The internal neighboring connections can be regarded as potential pathways between concepts where information propagates step by step.

3.3. Activity Levels

A three-valued logic is a minimal modelization of human-like processing, in order to take into account anticipatory and attentive behaviours. Moreover, the three values directly correspond to the three states of activity described in the neurobiological model (E0, E1, E2):

• E2 stands for the "TRUE" value: depending upon its localization, it can be interpreted as a "sure" stimulus, action, or concept.
• E0 stands for the "FALSE" value: in the neurobiological model, it is often due to an inhibition.
• E1 stands for "PERHAPS": it can be interpreted as a hypothesis. Generally, when a hypothesis is emitted, the system will search to validate it; the transition from E1 to E2 stands for the validation of such a hypothesis, that is a decision.

The intermediate state of activity can propagate in the network; this propagation allows a limited number of connections between units. The communication between distant units can be performed by spread of activity from one unit to the next. Propagation of state E1 is thus interpreted as propagation of a hypothesis, that is, an active search.

3.4. Activation Rules

The unit has to compute its outputs (internal and external) as a function of its inputs, of its prior state, and of previous learning. The two following functions have to be defined:

OI_i(t) = f(II_i^j(t), IE_i^k(t), OE_i(t - 1), P0_i^j, P2_i^j),
OE_i(t) = g(II_i^j(t), IE_i^k(t), OE_i(t - 1), P0_i^j, P2_i^j),

where j = 1, ..., n1 + n2; k indexes the external receptive field; and the P_i^j measure previous learning, as described below.

At time t, for each unit i, a global external input, denoted by IE_i, is computed as a function of the IE_i^k(t). It measures the correspondence between the input mask of a unit i, defined a priori without learning, and its external inputs IE_i^k; IE_i is set either to E2 (exact correspondence) or E0 (too much difference).

We then compute local outputs (OE_i^j, OI_i^j) for each internal input II_i^j modulated by IE_i, thanks to the truth table shown in Figure 1.

Finally, it is necessary to define a rule to combine several synchronous internal inputs at time t for a unit i. If the event "local internal output j of a unit i at time t is in state Ex" is denoted by OI_i,x^j(t), and the number of events OI_i,x^j(t) for j = 1 to n1 + n2 is denoted by nx, then OI_i(t) = Ex if max(n0, n1, n2) = nx. In case of equality, the unit is set to E1, the uncertain state. The global external output is computed in the same way.

This truth table (Figure 1) is built from the neurobiological data and is a model of neuronal interactions in a cortical column. Moreover, in the light of previous interpretations, the table corresponds to a logical data-processing mechanism. Only the cases which modify the outputs of the unit are taken into account.

• Line 5 illustrates the inhibition mechanism, which selectively limits the number of active units. Validation of a hypothesis (decision) results in a suppression of competitive hypotheses.
• By contrast, in line 6, two different decisions, when successive, are not competitive; furthermore, this temporal relation will result in learning.
• Line 3 is learning dependent, according to the coefficients P0_i^j and P2_i^j (see "learning rules"). When a hypothesis is emitted, its influence depends upon previous experiences as described in the learning rules section. It can suppress other possibilities (E0), emit a new hypothesis (E1) or trigger a decision (E2).
• Line 4 shows how a hypothesis is validated when it matches an external stimulus.
• In line 2, three different kinds of topologic maps are distinguished in the network, each kind devoted to a specific step in the information analysis. A stimulus will imply a certitude in "sensory maps" and a hypothesis in "associative maps," but cannot directly trigger an action in "motor maps."
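The combination rule given above (the most numerous potential level wins, and a tie falls back to E1) can be written directly. This is our sketch; the per-input potential levels are assumed to come from the Figure 1 table applied to each internal input in turn.

```python
from collections import Counter


def combine(potential_levels):
    """Global internal output OI_i(t) from the potential levels proposed by
    the n1 + n2 internal inputs (strings "E0", "E1", "E2")."""
    if not potential_levels:
        return "E0"
    ranked = Counter(potential_levels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "E1"                      # equality: the uncertain state
    return ranked[0][0]


print(combine(["E2", "E2", "E0"]))       # E2
print(combine(["E2", "E0"]))             # E1 (tie)
```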
3.5. Learning Rules

Like activation rules, learning rules are both consistent with neurobiological data (see Section 2) and with a data-processing interpretation. As in standard neural nets, learning is locally computed. Figure 2 shows how an internal input j (from a unit Aj) can produce four different effects on the target unit Ai (right part of the table), depending upon the probability of their coactivation, expressed by the repetition factors P2_i^j and P0_i^j. If N(e) measures the number of occurrences of the event e from t0 up to the current time, then

P0_i^j = N(II_i^j(t) = E2 and OI_i(t - dt) = E0) / N(II_i^j(t) = E2)

and

P2_i^j = N(II_i^j(t) = E2 and OI_i(t - dt) = E2) / N(II_i^j(t) = E2),

where dt stands for a small delay.
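These two coefficients are simple running conditional frequencies and can be maintained incrementally; the class below is our sketch of that bookkeeping for one internal input j of a unit i (the counter names and the observe method are ours, not from the paper).

```python
class RepetitionFactors:
    """Running estimates of P0_i^j and P2_i^j for one internal input j."""

    def __init__(self):
        self.n_strong_input = 0       # N(II_i^j(t) = E2)
        self.n_inhibited_before = 0   # N(II_i^j(t) = E2 and OI_i(t - dt) = E0)
        self.n_active_before = 0      # N(II_i^j(t) = E2 and OI_i(t - dt) = E2)

    def observe(self, internal_input, previous_internal_output):
        """Update the counts after each evaluation step."""
        if internal_input == "E2":
            self.n_strong_input += 1
            if previous_internal_output == "E0":
                self.n_inhibited_before += 1
            elif previous_internal_output == "E2":
                self.n_active_before += 1

    @property
    def p0(self):
        return self.n_inhibited_before / self.n_strong_input if self.n_strong_input else 0.0

    @property
    def p2(self):
        return self.n_active_before / self.n_strong_input if self.n_strong_input else 0.0
```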
The four cases may be interpreted as follows:

• Before learning, propagation of a hypothesis is random.
• During learning, if a concept (Aj, in state E0) never participates in the validation of another concept (Ai, E2), the latter will gain an anticipatory inhibitory effect (P0_i^j = 1).
• But if the concept Aj is sometimes linked with the validation of the concept Ai, learning will result in a selective propagation of the hypothesis Aj toward Ai (0 < P0_i^j < 1, 0 < P2_i^j < 1).
• Finally, if the coactivation pattern is always observed, a hypothesis from Aj will directly trigger a decision of Ai, even in the absence of external confirmation (P2_i^j = 1).

4. THE COMPUTATIONAL NETWORK

This theory, consistent with neurobiological data, provides autonomous basic units, each computing an output as a function of its inputs, modulated by its internal features which can change with learning. An implementation consists of building a network of units and simulating the inherent parallel processing.

The network was used to recognize alphabetical patterns (vision) and isolated words (speech recognition). These two sensory functions require the same global architecture shown in Figure 3, summarized by a signification map (SM) where activity represents the state of recognition, a sensory map I (SMI), a sensory map II (SMII) and an associative map (AM), all chosen to coincide with cortical areas which participate in a specific behavioural or cognitive task.

Since each unit behaves autonomously, the network is efficiently specified by the connectivity pattern of each map:

• The only disparity between the auditory and visual networks involves the initial encoding: an 8 × 8 grid where the letters are digitized for visual recognition, and, for auditory recognition, a 32 × 50 grid which receives spectrograms over 32 frequency ranges from 0 to 8000 Hz, and from 10 to 50 frames allocated to the word.
• Connections between the initial encoding and SMI are unidirectional and retinotopically (or tonotopically) organized. In SMI, units respond to the input, selectively oriented (Hubel & Wiesel, 1977) into local 3 × 3 overlapping receptive fields, as illustrated in Figure 3. Similarly, in the auditory case, units are specialized for time and frequency range.
• As illustrated in Figure 3, each unit in SMII receives its "external" 3 × 3 inputs from a field of SMI (feed-forward), sends back its actions to SMI, exchanges symmetrically its internal inputs and outputs with its neighbors in SMII, and is connected with a group of adjacent units in AM.
• In the associative map AM, each unit carries out a coactivation test between two neighborhoods taken from SMII and SM, and has reciprocal links with adjacent units in AM (internal relation).
• The map SM contains either a set of 26 units for visual recognition of letters (see Figure 3) or 10 units for number recognition in speech processing; these units are reciprocally connected and receive inputs from groups of adjacent units in AM.

4.1. Processing

Initially, every coefficient in the network is undetermined and requires no a priori choice. In a typical learning session, a given pattern (a character or a spectrogram) is presented to the receptor and all the units in SM are inhibited (E0), except the one corresponding to the solution, which is forced to E2 (stable state). Information propagates and interferes in the whole network, from these two external inputs, according to the connectivity scheme and the functioning rules. Eight evaluations are necessary for the network to come back to a stable state. During the learning phase, each unit reaching the stable state (E2) brings its relations with other connected units up to date (coefficients P0 and P2), following the learning rules. After learning, these coefficients delimit new functional sets of units which correspond to invariant characteristics of the external world.

During the recognition process, an unknown pattern is presented to the receptor and all the units in SM are set to E1 (unstable state, desired goal), in order to trigger parallel analysis. The network evolves until one unit in SM reaches the stable state E2 (reached goal, solution). The middle set of units, AM, allows association, through its connectivity, between the invariant characteristics in the two sensorimotor fields (visual or audio, and significative). AM changes the activity distribution in SMI and SMII. From an unstable state (E1) in all units of SM, the network converges toward a learned distribution (pattern) in SMI and SMII, and a stable state in SM, that is, a subset of highly active units (E2), inhibiting (E0) all the other units in this area.
4.2. Character Recognition

For the first session, the learning corpus is constituted of the 26 upper-case characters, binary digitized on an 8 × 8 grid. During learning, each letter is presented twice to the network. As described in previous studies (Fogelman Soulie et al., 1989), the test corpus has been built up from the initial letters by randomly inverting n pixels (n varying from 1 to 25) in each image. For the second session, the network trains with noisy letters, each being presented with 6, 9, and 12 randomly inverted pixels (Fogelman Soulie et al., 1989). The recognition rate is reported in Figure 4a as a function of noise. The performances are rather close, whatever the learning corpus. The slight difference between the two graphs in Figure 4a reveals the innate and implicit generalization abilities of the network. Let us mention here that during noisy character recognition, most of the errors were also committed by the human subject, owing to the ratio between the number of randomly inverted pixels and the number of pixels differentiating one letter from another (confusion between a noisy E and F, or a noisy O and Q, ...).

4.3. Speech Recognition

The learning and the test sets were prepared as follows. Thirty speakers recorded four versions of the 10 numbers. The recordings were made with no particular precaution (background noise, no starting synchronization, no constant rate of speech, ...). A Fast Fourier transform (FFT) was carried out over 8-ms frames. As a result, each token was represented as a set of 10-50 frames, each containing 32 positive values representing the 32 linear frequency ranges from 0 to 8000 Hz. In order to reduce the variation of the number of frames for a same number (beginning of each token not located, pronunciation duration, ...), a temporal compression of the spectrogram eliminates every frame too close to the preceding one and then transforms the 32 linear frequency ranges into 16 physiological frequency ranges. This array of (16 × n) values (n varying from 10 to 25 after compression) is proposed as an initial encoding to sensory map I.

For a speaker-dependent trial, the learning was done only once for each of the 40 spectrograms (4 × 10). The network always recognized the learned tokens, and made an average of 3% errors for the test tokens (Elman & Zipser, 1988).

For a speaker-independent trial, the network was trained on the 40 spectrograms from 2 to 20 people. The network's responses were tested either with learned subjects or with unknown subjects (Peeling, Moore, & Tomlinson, 1986). The recognition rate for subjects included in the learned set is reported in Figure 4b. It starts from a maximum rate for one speaker, and becomes stable at an average rate of 85% for many speakers. Regarding the recognition rate for unknown people, it grows from 50% for one learned speaker, up to 80% when the learning set contains many speakers. An unknown speaker may have all versions of a number refused, because of his particular pronunciation. Isolated errors are often due to confusion between two fricatives or two vowels.

[Figure 4: two plots; (a) recognition rate (25-100%) versus number of inverted pixels, (b) recognition rate (25-100%) versus number of speakers, with curves for learned speakers and others.]

FIGURE 4. Visual and auditory performances of the network. Figure 4a: The visual case (upper diagram). This figure shows the recognition rate as a function of noise level. The network was trained on the whole alphabet, with or without noise, and gave very close performances for the recognition of noisy letters. Let us mention here that the system needs only 10 sec to learn the whole alphabet. Figure 4b: The auditory case (lower diagram). This shows the recognition rate as a function of the number of speakers in the learning set. In both cases, the network was trained on the 10 numbers and was tested either with the learned speakers or with unknown speakers. If only one speaker is learned, the network recognizes him or her perfectly but often fails (50%) with another speaker. When more speakers are learned, the performances slightly decrease for the learned speakers (85% for 20 learned speakers), as the number of tokens correctly labeled for unknown people increases up to a similar rate (80%), toward a supposed common limit for the two curves.
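The noisy test corpus of Section 4.2 is easy to reproduce; the helper below is our sketch of that construction (the alphabet variable, a list of 26 binary 8 × 8 bitmaps, is an assumed input, not data from the paper).

```python
import random


def noisy_copy(bitmap, n_inverted):
    """Invert n randomly chosen pixels of an 8 x 8 binary letter
    (given as a flat list of 64 values in {0, 1})."""
    noisy = list(bitmap)
    for idx in random.sample(range(len(noisy)), n_inverted):
        noisy[idx] = 1 - noisy[idx]
    return noisy


# Test corpus as in Section 4.2: 1 to 25 inverted pixels per image.
# alphabet = [...]  # 26 bitmaps, one per upper-case letter (assumed)
# test_corpus = {n: [noisy_copy(b, n) for b in alphabet] for n in range(1, 26)}
```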
Figure 5 illustrates the internal representation of learning through probabilistic coefficients.

[Figure 5: four panels plotted against time; panel titles: "Mental representation of a 2," "Mental representation of a 3," "Sonogram of a 3 recognized," "Sonogram of a 3 recognized"; vertical axis: probability.]

FIGURE 5. Internal and external representation. This figure represents configurations in the network for recognition of isolated words, in order to illustrate the basis on which the matching operation is effected. The two pictures on the left show the memorized coefficients between units of AM and two different units of SM recognizing a "2" (first picture) or a "3" (second picture). This "mental representation" is obtained after learning 20 × 4 occurrences of the digits. By contrast, the two pictures on the right show the initial occurrences for two spectrograms, both corresponding to the digit "3," and produced by two different speakers (note their disparity). These occurrences are recognized by the stored coefficients of the mental representation of "3" (second picture).

4.4. Spatiotemporal Complexity

For vision, the network has 1200 units with 6 connections per unit; for speech recognition the network's size was increased to 4000 units because of the bigger size of the input information flow; the number of connections per unit is identical. The simulations reported here were run on a Sun 3/50. The learning is very fast: the network needs 15 s to learn the 40 spectrograms of one speaker, and 10 s to learn the 26 letters. Recognition requires in both cases about 0.5 s.

5. CONCLUSION

Several simulations of the processing effected by the cortical areas have been proposed using multilayered neuronal networks and back-propagation algorithms (Zipser & Andersen, 1988; Lehky & Sejnowsky, 1988). But the principles of these algorithms are far from biological mechanisms. Our approach is quite different: we propose a general processing unit which is based upon neuronal properties in the cerebral cortex and fits experimental data; furthermore, this unit is common to all cortical areas and could be used for a variety of adaptive and cognitive functions. Simulations show the ability of the cortical column to effect associative recognition, which is an adaptive function of biological systems commonly implemented by artificial neural networks. Figure 4 demonstrates levels of performance comparable to the best available algorithms; moreover, learning speed is much faster than in algorithms used in other multilayered networks (for example, back-propagation). This work illustrates the fact that it is possible to elaborate models integrating three basic properties: (i) the basic rules and the construction of the network are physiologically plausible; (ii) the rules are simple (in-out tables); and (iii) the network performs rapid pattern recognition with an efficient learning, as shown here for both visual and auditory inputs. A promising direction is the capacity of such models to result in sophisticated behaviors (Burnod, 1988), as we shall attempt to illustrate in on-going and future work. This includes (i) construction of functional networks with goals and subgoals to generate planning, (ii) progressive learning of symbolic reasoning, (iii) combination of networks which include several information pathways with specific external inputs and outputs for the different sensory-motor fields (Guyot, Alexandre, & Haton, 1989), and (iv) integration of time (Guyot et al., 1989).
REFERENCES

Alexandre, F., Burnod, Y., Guyot, F., & Haton, J. P. (1989). La colonne corticale, unité de base pour des réseaux multicouches. Compte Rendu à l'Académie des Sciences, Paris, 309(III), 259-264.
Asanuma, H., Waters, R. S., & Yumina, H. (1982). Physiological properties of neurons projecting from area 3a to area 4 of feline cerebral cortex. Journal of Neurophysiology, 48(4), 1048-1057.
Ballard, D. H. (1986). Cortical connections and parallel processing: Structure and function. The Behavioral and Brain Sciences, 9, 67-120.
Barrionuevo, G., & Brown, T. H. (1983). Associative long-term potentiation in hippocampal slices. Proceedings of the National Academy of Sciences, Neurobiology, 80, 7347-7351.
Burnod, Y. (1988). An adaptive neural network: The cerebral cortex. Paris: Masson.
Connors, B. W., Gutnick, M. J., & Prince, D. A. (1982). Electrophysiological properties of neocortical neurons in vitro. Journal of Neurophysiology, 48(6), 1302-1320.
Elman, J. L., & Zipser, D. (1988). Learning the hidden structure of speech. Journal of the Acoustical Society of America, 83(4), 1615-1626.
Evarts, E. V., & Tanji, J. (1974). Gating of motor cortex reflexes by prior instruction. Brain Research, 71, 479-494.
Fogelman Soulie, F., Gallinari, P., Le Cun, Y., & Thiria, S. (1989). Network learning. In Y. Kodratoff & R. Michalski (Eds.), Machine learning (Vol. 3). San Mateo, CA: Morgan Kaufmann.
Fukushima, K. (1980). Neocognitron: A self-organizing neural network for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36, 193-202.
Fuster, J. M. (1977). Unit activity in the prefrontal cortex during delayed response performance: Neuronal correlates of short-term memory. Journal of Neurophysiology, 36, 61-78.
Gilbert, C. D., & Wiesel, T. N. (1981). Laminar specialization and intracortical connections in cat primary visual cortex. In F. Schmitt et al. (Eds.), The organization of the cerebral cortex. Cambridge, MA: MIT Press.
Grossberg, S. (1988). Nonlinear neural networks: Principles, mechanisms, and architectures. Neural Networks, 1, 17-61.
Guyot, F., Alexandre, F., & Haton, J. P. (1989). Toward a continuous model of the cortical column: Application to speech recognition. In Proceedings of the International Congress of Acoustic and Speech Processing, Glasgow. New York: IEEE.
Guyot, F., Alexandre, F., Haton, J. P., & Burnod, Y. (1989). A potentially powerful connectionist unit: The cortical column. NATO Advanced Research Workshop on Neuro Computing, Les Arcs. Heidelberg: Springer-Verlag.
Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79, 2554-2558.
Hubel, D. H., & Wiesel, T. N. (1977). Functional architecture of macaque monkey visual cortex (Ferrier Lecture). Proceedings of the Royal Society of London B, 198, 1-59.
Jones, E. G. (1981). Identification and classification of intrinsic circuit elements in the neocortex. In G. M. Edelman, E. Gall, & W. M. Cowan (Eds.), Dynamic aspects of neocortical functions. New York: John Wiley & Sons.
Kohonen, T. (1984). Self-organization and associative memory. New York: Springer-Verlag.
Lehky, S. R., & Sejnowsky, T. J. (1988). Network model of shape-from-shading: Neural function arises from both receptive and projective fields. Nature, 333, 452-455.
Minsky, M., & Papert, S. (1969). Perceptrons. Cambridge, MA: MIT Press.
Mountcastle, V. B. (1978). An organizing principle for cerebral function: The unit module and the distributed system. In The mindful brain. Cambridge, MA: MIT Press.
Mountcastle, V. B., Andersen, R. A., & Motter, B. C. (1981). The influence of attentive fixation upon the excitability of the light-sensitive neurons of the posterior parietal cortex. Journal of Neuroscience, 1, 1218-1235.
Peeling, S. M., Moore, R. K., & Tomlinson, M. J. (1986). The multi-layer perceptron as a tool for speech pattern processing research. In Proceedings of the IOA Autumn Conference on Speech and Hearing. London: Controller HMSO.
Rumelhart, D. E., & McClelland, J. L. (1986). Parallel distributed processing. Cambridge, MA: MIT Press.
Shaw, G. L., Silverman, F. J., & Pearson, J. C. (1988). Trion model of cortical organization and the search of the code of short-term memory and of information processing. In J. Delacour & J. C. S. Levy (Eds.), Systems with learning and memory abilities. Elsevier, North-Holland.
Szentagothai, J. (1975). The "module concept" in cerebral cortex architecture. Brain Research, 95, 475-496.
Van Essen, D. C., & Maunsell, J. H. R. (1983). Hierarchical organization and functional streams in the visual cortex. Trends in Neurosciences, 6, 370-375.
Zeki, S., & Shipp, S. (1988). The functional logic of cortical connections. Nature, 335, 311-316.
Zipser, D., & Andersen, R. A. (1988). A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature, 331, 679-684.
