
Motor Control Models: Learning and Performance

Pietro G Morasso, Italian Institute of Technology, Genoa, Italy


© 2015 Elsevier Ltd. All rights reserved.

Abstract

The focus of the article is on the variety of attempts that have been investigated for capturing the complexity of purposive
action and adaptive behavior, having defined a coordinated action as a class of movements plus a goal. Redundancy is a side
effect of this connection, and thus redundancy is necessarily task oriented, something to be managed ‘online’ and rapidly
updated as the action unfolds. The article then analyzes two main computational mechanisms that have been proposed as
candidates of how the brain may deal with motor redundancy: (1) the force-field-based solution known as Equilibrium-Point
Hypothesis (EPH) and (2) the cost-function-based solution to the degrees of freedom problem, namely Optimal Control
Theory. However, both theories apply only to overt actions where force fields and cost functions are directly related to the
interaction of the body with the physical world in the course of a real action. Considering that overt actions are just the tip of
the iceberg, hiding the vast domain of covert actions that are the skeleton of motor cognition, an extension of EPH is
described, Passive Motion Paradigm (PMP). The relationships between PMP, the simulation theory of covert actions, internal
models, and the body schema concept are also analyzed. Finally, general learning mechanisms that may support the
acquisition of internal computational modules are briefly summarized.

Modeling the way in which humans learn to coordinate their movements in daily life or in more demanding activities is an important scientific topic from many points of view, such as the medical, psychological, kinesiological, and cybernetical. The article analyzes the complexity of this problem and reviews the variety of experimental and theoretical techniques that have been developed for this purpose.

With the advent of technical means for capturing motion sequences and the pioneering work of Marey (1894) and Muybridge (1957) in this area, the attempt at describing, modeling, and understanding the organization of movement has become a scientific topic. The fact that human movements are part of everyday life paradoxically hides their intrinsic complexity and justifies initial expectations that complete knowledge could be achieved simply by improving the measurement techniques and carrying out a few carefully designed experiments. Unfortunately, this is not the case. Each experiment is frequently the source of more questions than answers, and thus the attempt to capture the complexity of purposive action and adaptive behavior, after a century of extensive multidisciplinary research, is far from over.

The conventional view is based on a separation of perception, movement, and cognition and the segregation of perceptual, motor, and cognitive processes in different parts of the brain, according to some kind of hierarchical organization. This view is rooted in the empirical findings of neurologists of the nineteenth century, such as J. Hughlings Jackson, and has a surprising degree of analogy with the basic structure of a modern PC that typically consists of input and output peripherals connected to a central processor. Perhaps the analogy with modern technology explains why this old-fashioned attitude still has its supporters, in spite of the massive empirical and conceptual challenge to this view and its inability to explain the range of skills and adaptive behaviors that characterize biological organisms.

Let us consider perception, which is the process whereby sensory stimulation is translated into organized experience. That experience, or percept, is the joint product of the stimulation and of the process itself, particularly in the perception and representation of space. An early theory of space perception put forth by the Anglican bishop G. Berkeley at the beginning of the eighteenth century was that the third dimension (depth) cannot be directly perceived in a visual way, because the retinal image of any object is two-dimensional, as in a painting. He held that the ability to have visual experiences of depth is not inborn but can only result from logical deduction based on empirical learning through the use of other senses.

The first part of the reasoning (the need for a symbolic deductive system for compensating the fallacy of the senses) is clearly wrong, and the roots of such a misconception can be traced back to the neoplatonic ideas of the Italian Renaissance in general, and to Alberti’s window metaphor in particular. Also, the Cartesian dualism between body and mind is just another face of the same attitude, and this ‘Descartes’ error,’ to quote Damasio (1994), is on a par with Berkeley’s error described in the preceding paragraph, and is at the basis of the intellectualistic effort to explain the computational complexity of perception that characterizes a great part of the classic artificial intelligence approach. However, the latter part of Berkeley’s conjecture (the emphasis on learning and intersensory integration) is surprisingly modern and agrees, on the one hand, with the modern approach to neuropsychological development pioneered by Piaget (1963), and on the other hand, with the so-called connectionist point of view, which originated in the 1980s as a computational alternative to classic artificial intelligence.

An emergent idea is also the motor theory of perception, well illustrated by Berthoz (1997); that is, the concept that perception is not a passive mechanism for receiving and interpreting sensory data but is the active process of anticipating the sensory consequences of an action and thereby binding the sensory and motor patterns in a coherent framework. In computational terms, this implies the existence in the brain of some kind of ‘internal model,’ as a bridge

International Encyclopedia of the Social & Behavioral Sciences, 2nd edition, Volume 15 http://dx.doi.org/10.1016/B978-0-08-097086-8.43068-0 957

between action and perception. As a matter of fact, the idea that the instructions generated by the brain for controlling a movement are utilized by the brain for interpreting the sensory consequences of the movement is already present in the pioneering work of Helmholtz and von Uexküll, and its influence has resurfaced in the context of recent control models based on learning (e.g., Wolpert and Kawato, 1998). The generally used term is ‘corollary discharge’ (von Holst and Mittelstaedt, 1950), which implies an internal comparison between an outgoing signal (the efference copy) and the corresponding sensory re-afference: the coherence of the two representations is the basis for the stability of our sensorimotor world. This kind of circularity and complementarity between sensory and motor patterns is obviously incompatible with the conventional reasoning based on hierarchical structures. A similar kind of circularity is also implicit in Piaget’s concept of ‘circular reaction,’ which is assumed to characterize the process of sensorimotor learning; that is, the construction of the internal maps between perceptually identified targets and the corresponding sequence of motor commands.

An additional type of circularity in the organism/environment interaction can be identified at the mechanical interface between the body and the outside world, where the mechanical properties of muscles interact with the physics of inanimate objects and gravity. This topic area has evolved from the Russian school, with the early work on the nature of reflexes by I.P. Pavlov and the subsequent critical reexamination by Anokhin (1974) and Bernstein (1967). In particular, we owe to Bernstein the seminal observation (the comparator model) that motor commands alone are insufficient to determine movement but only identify some factors in a complex equation where the world dynamics has a major influence. This led, among other things, to the identification of muscle stiffness as a relevant motor parameter and the formulation of the theory of equilibrium-point control (Feldman and Levin, 1995; Bizzi et al., 1992).

In general, we may say that Helmholtz’s corollary discharge, Piaget’s circular reaction, and Bernstein’s comparator model are different ways to express the ecological nature of motor control; that is, the partnership between brain processes (including muscles) and world dynamics. On top of this, another type of circularity from a more cognitive point of view is suggested by the parietofrontal mirror circuit (Rizzolatti and Sinigaglia, 2010), which is proposed as the unique mechanism “that allows an individual to understand the action of others from the inside and gives the observer a first-person grasp of the motor goals and intentions of other individuals.”

On the other hand, these general ideas on motor control could not immediately provide mathematical tools of analysis from which to build models and perform simulations. The art and science of building motor control models is a later development and has been influenced by the methods designed by engineers in the field of automatic control and computer science. Most of the techniques are based on linear approximations and explicit schematizations of the phenomena. In particular, two main concepts can be singled out for their influence on the study of motor control: the concept of feedback and the concept of motor program, both of them originating from the seminal work on cybernetics initiated by Norbert Wiener (1948). Their level of influence in brain theory is certainly determined by the tremendous success of these techniques in the modern technological world. However, their direct applicability to what we may call the biological hardware is questionable for three main reasons: (1) feedback control can only be effective and stable if the feedback delays are negligible, but this is not the case for biological feedback signals, where transduction and transmission delays add up to tens of milliseconds; (2) the concept of motor program implies a sequential organization, which can only be effective if the individual steps in the sequence are sufficiently fast, and this is in contrast with the parallel, distributed processing of the brain made necessary by the relative slowness of synaptic processing; and (3) the degrees of freedom (DoF) problem, further analyzed in the following section, implies that engineering modeling techniques, typically conceived for low-dimensional applications, cannot be scaled up easily due to the so-called curse of dimensionality.

Motor Redundancy and the ‘Cybernetics of Purposive Actions’

Since the time of Nicholas Bernstein (1967) it has become clear that one of the central issues in the neural control of movements is the ‘degrees of freedom problem,’ that is, the computational process by which the brain coordinates the action of a high-dimensional set of motor variables for carrying out the tasks of everyday life, usually described and learned in a ‘task-space’ of much lower dimensionality. Such dimensionality imbalance is usually named ‘motor redundancy.’ This means that the same movement goal can be achieved by an infinite number of combinations of the control variables and timing patterns, which are equivalent as far as the task is concerned. But in spite of so much freedom, experimental evidence suggests that the brain consistently uses a narrow set of solutions. Consider, for example, the task of reaching a point B in space, starting from a point A, in a given time T. In principle, the task could be carried out in an infinite number of ways, with regard to spatial aspects (hand path), timing aspects (speed profile of the hand), and recruitment patterns of the available DoFs (kinematic DoF). In contrast, it was found that the spatiotemporal structure of this class of movements is strongly stereotypical, whatever their amplitude, direction, and duration: the path is nearly straight (in the extrinsic, Cartesian space, not the intrinsic, articulatory space), and the speed profile is nearly bell shaped, with symmetric acceleration and deceleration phases (Morasso, 1981). That this stereotypicity should be attributed to internal control mechanisms, not to biomechanical effects, is suggested by the analysis of reaching movements in different types of neuromotor impaired subjects and the adaptation of normal subjects to disturbing force fields (Shadmehr and Mussa-Ivaldi, 1994).

A movement, per se, is nothing unless it is associated with a goal and this usually requires recruitment of a number of joints, in the context of an action. Recognizing the crucial importance of multijoint coordination was really a paradigm shift from the classical Sherringtonian viewpoint (typically

focused on single-joint movements), to the Bernsteinian quest for principles of coordination or synergy formation. A coordinated action is a class of movements plus a goal. Redundancy is a side effect of this connection and thus redundancy is necessarily task oriented, something to be managed ‘online’ and rapidly updated as the action unfolds.

Generally speaking, actions can be considered as operational modules in which descending motor patterns are produced together with the expectation of the (multimodal) sensory consequences. Mounting evidence accumulated in the last decades from different directions and points of view, such as the equilibrium point hypothesis, the mirror neuron system, and motor imagery, suggests that in order to understand the neural control of movement, the observation and analysis of overt movements is just the tip of the iceberg, because what really matters is the large computational basis shared by action production, action observation, action reasoning, and action learning.

How the Brain May Deal with Motor Redundancy

Let us consider again the stereotypicity of reaching movements that emerges in a robust way from the redundancy of the human motor system: where does it come from? Two main alternatives, although with many variants, have been investigated in the last decades: one is based on force fields and nonlinear attractor dynamics and the other on cost functions and optimal control concepts.

The Force-Field-Based Solution to the DoF Problem: The Equilibrium-Point Hypothesis

The best-known example of a force-field-based solution to the DoF problem is the Equilibrium Point Hypothesis (EPH) (Asatryan and Feldman, 1965; Feldman, 1966; Bizzi et al., 1976, 1992; Feldman and Levin, 1995). The force field comes from the elastic properties of muscles and their capacity to store and release mechanical energy.

The elastic properties of muscles are well captured by the so-called λ-model (Feldman and Levin, 1995). In this model, λ is the controllable parameter that sets the activation threshold of the monosynaptic stretch reflex and thus determines the ‘rest length’ of the ‘muscle spring.’ Its value is specified by supraspinal motor commands and expresses a combination of the desired levels of muscle length and stiffness. By setting the λ-commands of all the muscles, the brain implicitly codes an equilibrium point, determined by the fact that at this point the spring actions cancel each other. In this way, movement follows as a mechanical consequence of the force field, without a continuous intervention of the brain, and the brain can take advantage of a second type of redundancy (in addition to the ‘kinematic redundancy,’ determined by the excess number of DoFs), namely ‘muscle redundancy,’ determined by the excess number of muscles in relation to the number of DoFs.

A crucial feature is that muscles are not linear springs but are characterized by length–tension curves of exponential type, whose stiffness depends on the difference between the λ-command and the actual muscle length. Therefore, the brain has the opportunity of independently controlling two variables: (1) the global equilibrium point, by setting the λ-commands equal to the muscle lengths of a desired posture (the reciprocal component of the λ-commands); and (2) the global stiffness of the joints, by adding to the previous pattern a set of coactivation λ-commands in such a way that the stronger the coactivation, the stronger the stiffness.

Beyond the specific neuromuscular details of the λ-model, EPH is important because it assigns a computational role to the muscles, in addition to their obvious executive action. Summing up, the power of the EPH comes from its ability to solve the DoF problem by positing that the redundant posture of the whole body is not directly controlled by the brain in a detailed way but is the biomechanical consequence of the equilibrium among a large set of muscular and environment forces. In this view, movement is a symmetry-breaking phenomenon, that is, the transition from one equilibrium state to another.

The Cost-Function-Based Solution to the DoF Problem: Optimal Control Theory

Optimal control theory is a classical engineering design technique for controlling complex systems in which infinitely many solutions are possible, given a desired task or behavior. The general idea is that in order to design the best possible controller of a system, capable of carrying out a prescribed task, one should first define a ‘cost function,’ that is, a mathematical combination of the control variables that yields a single number (the ‘cost’ of the action). This function is generally composed of two parts: a part that measures the ‘distance’ of the system from the goal of the action and a part (regularization or penalty term) that encodes the required ‘effort.’ The design is then reduced to the computation of the control variables that minimize the cost function, thus finding the best possible trade-off between accuracy and effort.

The first attempt to apply this approach to human motor control was carried out by Flash and Hogan (1985), by proposing the ‘integrated jerk’ as the regularization term of the cost function. ‘Jerk’ is the time derivative of acceleration, and thus minimizing jerk is equivalent to maximizing the smoothness of the generated trajectory. They showed that the solution of such minimization tasks for point-to-point, planar reaching movements was indeed consistent with the spatiotemporal invariances found by Morasso (1981). However, other simulation studies found similar results by choosing different types of cost functions, such as ‘integrated torque change’ (Uno et al., 1989) or ‘motor noise dependent statistics’ (Harris and Wolpert, 1998). Thus, it is not clear how to identify the cost function supposedly used by the brain, because different alternatives tend to yield similar behaviors.

In this line of research, optimal control concepts were used for deriving offline optimal control patterns, to be employed in feed-forward control schemes. A later development (Todorov and Jordan, 2002) suggested using an extension of optimal control theory that incorporates sensory feedback in the computational architecture. In this closed-loop control technique, a block named ‘control policy’ generates a stream of motor commands that optimize the predefined cost function on the basis of a current estimate of the ‘state variables’; this

estimate integrates in an optimal way (by means of a Kalman filter) feedback information (coming from delayed and noise-corrupted sensory signals) with a prediction of the state provided by a forward model of the system’s dynamics, driven by an ‘efference copy’ of the motor commands. One of the most attractive features of this formulation, in addition to its elegance and apparent simplicity, is that it blurs the difference between feed-forward and feedback control because the control policy governs both. On the other hand, the mathematical computations that need to be carried out in order to identify the optimal control policy are quite complex and do not scale up well with dimensionality. Moreover, optimal feedback control indeed requires feedback, and thus is unable to treat overt and covert actions in a uniform way.

Beyond EPH: The Passive Motion Paradigm as a General Approach to Action Generation

In the EPH framework, the computational core of the mechanism is based on force fields and the associated attractor dynamics rather than on cost functions. In the standard formulation, the source of the force fields is physical and is determined by the mechanical properties of muscles. Therefore, this model can only be applied to overt, not covert, movements. However, it is possible to abstract the model to the attractor dynamics of interacting neural representations, and this is proposed by the Passive Motion Paradigm (PMP) (Mussa-Ivaldi et al., 1988). The basic idea can be formulated in qualitative terms by suggesting that the process by which the brain can determine the distribution of work across a redundant set of joints, when the chosen end-effector is assigned the task of reaching a target point in space, can be represented as an internal simulation process that calculates how much each joint would move if an externally induced force (i.e., the goal) pulls the end-effector by a small amount toward the target. This internal simulation in turn causes the incremental elastic reconfiguration of the internal body schema involved in generating the action, by disseminating the force field across the kinematic chain (more generally, task-specific kinematic graph) that characterizes the articulated structure of the human or robotic body. The mechanism is labeled ‘passive’ in line with the EPH. The underlying hypothesis is that the equilibrium point is not explicitly specified by the brain, which just contributes to the activation of ‘task-related’ force fields. The mechanism is robust and can be easily extended to high-dimensional problems because there is no need to solve ill-posed inverse problems but only to run an internal simulation of the dynamic body schema. This mechanism applies equally well to covert and overt actions, and this is consistent with the mounting evidence from brain-imaging studies in support of common neural substrates being activated during both ‘real and imagined’ movements. For this reason, it is plausible to posit that also real, overt actions are the results of an internal simulation, where such simulation is a result of the interactions between an ‘internal body model’ and the attractor dynamics of force fields induced by the goal and task-specific constraints involved during the performance of any action. If the mental simulation converges (i.e., the goal is realized), then the movement can be executed. Otherwise, convergence failure may play the role of a crucial internal event, namely the starting point to break the action plan into a sequence of subactions, by recruiting additional DoFs, employing tools that may allow the realization of the goal, and so forth. In this sense, PMP can be considered a generalization of EPH from action execution (overt actions) to action planning and reasoning about actions (covert actions).

The dynamics of PMP networks is driven by a kind of internal potential energy, associated with the previously described attractor dynamics. The basic idea is not far away from the ‘free-energy principle’ advocated by Friston (2010) in order to account for action, perception, and learning in a unified way; and it is, at the same time, an alternative to the optimal control theory (Mohan and Morasso, 2011).

Examples of PMP Networks

Let $\vec{q}$ be the set of all the DoFs that characterize the human body or the body of a humanoid robot, possibly extended by including the DoFs of a tool. Any given task identifies one or more end-effectors, and is defined by the motion of one end-effector with respect to another or to some reference point. The natural reference frame for $\vec{x}(t)$ is linked to the environment (extrinsic) space and not the joint (intrinsic) space. Moreover, the dimensionality of $\vec{q}$ is generally much greater than the dimensionality of $\vec{x}$.

The basic idea of the PMP is to express the goal of an action (e.g., ‘reach a target point $\vec{x}_T$’) by means of an attractive force field $\vec{F}_{tgt} = K(\vec{x}_T - \vec{x})$, centered on the target position, and to apply it to the body schema, in particular to the task-related end-effector. This force field is mapped into a set of torques applied to the joints according to the Jacobian matrix of the arm: $\vec{T}_{tgt} = J^{T} K(\vec{x}_T - \vec{x})$. The displacement of the articulations is then determined by an admittance matrix $A$ that distributes the motion among the (redundant) set of joints: $d\vec{q} = A J^{T} K(\vec{x}_T - \vec{x})$. By integrating this equation over time and mapping the displacements of the articulations into small displacements of the end-effector ($d\vec{x} = J\,d\vec{q}$), the whole body schema will evolve from the initial equilibrium configuration to a final, desired configuration where the force applied to the end-effector is null because the end-effector has reached the target. This dynamics of the body schema can also be expressed by means of a graph (or kinematic network) such as that in the top panel of Figure 1. The graph also includes a nonlinear gain that makes it possible to reach stability in a prescribed time, according to the theory of terminal attractors (Zak, 1988). The relaxation process, from the initial equilibrium state to the final one, is analogous to the mechanism of coordinating the motion of a wooden marionette by means of strings attached to the terminal parts of the body: the distribution of the motion among the joints is the ‘passive’ consequence of the virtual forces applied to the end-effectors and the virtual compliance of the joints. The extension from a single limb to the whole body of humans or humanoid robots is quite straightforward, as shown in the bottom panel of Figure 1.

This modeling framework can be successfully applied for explaining the formation of Whole Body Reaching (WBR) synergies, that is, coordinated movement of lower and upper limbs, characterized by a focal component (the hand must

reach a target) and a postural component (the center of mass (CoM) must remain inside the support base) (Morasso et al., 2010). The focal component of the task was modeled by means of a simple attractive field to the target, applied to the fingertips of both hands. The postural component was implemented by an additional force field applied to the pelvis region. By simulating the network in various conditions it was possible to show that it exhibits many of the spatiotemporal features found in experimental data of WBR in humans (Stapley et al., 1999; Pozzo et al., 2002; Kaminski, 2007), in particular the fact that the speed profiles of the hand and the CoM are synchronized (Figure 2).

Figure 3 shows the PMP network used for the bimanual coordination of the movements of a humanoid robot. The network, in this case, is composed of three parts, one related to the left arm, a second related to the right arm, and a third to the trunk. Task-specific networks are logically derived from the global whole-body network (the global body schema) by ‘grounding’ some element of the network, that is, inhibiting the propagation of the task-related force fields beyond a given articulation or body part. In this sense, the body schema represented by PMP networks is not a passive map but is a dynamic structure that recruits at run-time the redundant DoF of the body in the context of task-related force fields.
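The relaxation process described in this section can be illustrated with a minimal numerical sketch. It is not the implementation used in the cited studies: it assumes a planar three-joint chain, treats the joint displacement $d\vec{q} = A J^{T} K(\vec{x}_T - \vec{x})$ as a rate integrated with forward Euler steps, and omits the nonlinear time-base gain of the terminal attractor. The function names (`forward_kinematics`, `jacobian`, `pmp_relax`), the link lengths, the stiffness, and the target are all hypothetical choices made for illustration; the two admittance values for the proximal joint loosely echo those quoted in the caption of Figure 2.

```python
import numpy as np

def forward_kinematics(q, lengths):
    """End-effector position x of a planar serial chain with joint angles q."""
    angles = np.cumsum(q)  # absolute orientation of each link
    return np.array([np.sum(lengths * np.cos(angles)),
                     np.sum(lengths * np.sin(angles))])

def jacobian(q, lengths):
    """2 x n Jacobian J = dx/dq of the planar chain."""
    angles = np.cumsum(q)
    J = np.zeros((2, len(q)))
    for i in range(len(q)):
        # Rotating joint i moves every link from i to the tip.
        J[0, i] = -np.sum(lengths[i:] * np.sin(angles[i:]))
        J[1, i] = np.sum(lengths[i:] * np.cos(angles[i:]))
    return J

def pmp_relax(q0, x_target, lengths, A, K, dt=1e-3, steps=10000):
    """Integrate dq = A J^T K (x_T - x) with forward Euler steps (assumed scheme)."""
    q = np.array(q0, dtype=float)
    for _ in range(steps):
        x = forward_kinematics(q, lengths)
        force = K @ (x_target - x)               # F_tgt = K (x_T - x)
        torque = jacobian(q, lengths).T @ force  # T_tgt = J^T F_tgt
        q = q + dt * (A @ torque)                # dq = A T_tgt
    return q

# Illustrative values only (assumptions, not taken from the article).
lengths = np.array([0.3, 0.3, 0.2])   # link lengths (m)
q_start = np.array([0.3, 0.6, 0.4])   # initial joint angles (rad)
x_target = np.array([0.3, 0.5])       # target point (m)
K = 100.0 * np.eye(2)                 # virtual stiffness of the force field

A_compliant = np.diag([2.5, 1.0, 1.0])  # proximal joint moves easily
A_stiff = np.diag([0.1, 1.0, 1.0])      # proximal joint moves reluctantly

q_compliant = pmp_relax(q_start, x_target, lengths, A_compliant, K)
q_stiff = pmp_relax(q_start, x_target, lengths, A_stiff, K)

print("compliant proximal joint:", q_compliant,
      forward_kinematics(q_compliant, lengths))
print("stiff proximal joint:    ", q_stiff,
      forward_kinematics(q_stiff, lengths))
```

In this sketch both runs bring the end-effector to the same target, but the final joint configurations differ: no inverse kinematics is solved, and the admittance matrix alone decides how the motion is shared among the redundant joints.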

Figure 1 Top panel. Basic kinematic network that implements the passive motion paradigm for a simple kinematic chain. It includes the Gamma function, which endows the system with terminal attractor dynamics. This means that equilibrium is not achieved asymptotically but in finite time. (The corresponding speed profile is also shown.) External and internal constraints (represented as task-dependent force/torque fields) bias the path to equilibrium in order to take into account suitable ‘penalty functions.’ This is a multi-referential system of action representation and synergy formation, which integrates a forward and an inverse internal model. Middle panel. The figure illustrates the key element of the architecture of Figure 1 for solving the degrees of freedom problem, namely the mapping of the force field, defined in the extrinsic space and applied to the end-effector, into the corresponding torque field, defined in the intrinsic space and applied to the joints. The mapping is implemented by means of the transpose Jacobian matrix of the kinematic transformation. Dimensionality reduction is obtained implicitly by letting the internal model ‘slide’ in the torque field. Each point of the trajectory in the extrinsic space corresponds to a whole manifold in the intrinsic space (the ‘null space’ of the kinematic transformation). The equilibrium point in the force field corresponds to an equilibrium manifold in the torque field. The selection among the infinite number of possible targets is carried out implicitly by the combination of different force/torque fields. Bottom panel. Full kinematic network of the iCub robot (53 DoFs; see Sandini et al., 2004). Each blue box corresponds to a basic kinematic network (top panel) for a specific body segment. ‘Tools’ corresponds to possible interaction points between the external world and the internal body model; the corresponding green arrows identify potential goals and the related goal-oriented force fields. Such force fields are propagated through the global network (blue arrows), searching for a task-dependent equilibrium configuration. A single or multiple time-base generators control the timing. Sandini, G., Metta, G., Vernon, D., 2004. RobotCub: an open framework for research in embodied cognition. In: Proceed. 4th IEEE/RAS Intl Conf. on Humanoid Robots, LA, CA, 13–32.

Figure 2 Coordination of whole-body reaching movements by means of the passive motion paradigm. For the same target, the top panel shows two motion patterns obtained with different values of the admittance matrix A, namely the hip admittance is reduced from 2.5 rad s⁻¹ (N m)⁻¹ (left graph) to 0.1 rad s⁻¹ (N m)⁻¹ (right graph). The bottom panel shows the time course of the joint rotations, including the time base generator Γ(t) (left graph), and the speed profiles of the end-effector and the center of mass.

PMP and the Simulation Theory of Covert Actions

Experimental results (in terms of EEG, fMRI, PET, and NIRS) generally support the idea of common underlying functional networks subserving both the execution and imagination of movements (Kranczioch et al., 2009; Munzert et al., 2009). Marc Jeannerod (2001) went a step further by formulating the Mental Simulation Theory, which posits that cognitive motor processes such as motor imagery, movement observation, action planning, and verbalization share the same representations with motor execution. The neural activation patterns include not only premotor and motor areas such as the PMC (premotor cortex), SMA (supplementary motor area), and M1 (primary motor cortex) but also subcortical areas of the cerebellum and the basal ganglia. In particular, the presence of activity in the typically motor regions suggests that covert actions are in the same motor format that is required by overt actions. The concurrent activation of descending motor pathways might be involved in the generation of efference copies that propagate upstream

to parietal and premotor cortices, thus predicting the potential consequences of the planned action. Jeannerod interprets this brain activity as an ‘internal simulation’ of a detailed representation of action and uses the term S-states for the corresponding mental states. The crucial point, in our view, is that if S-states occurring during covert actions are to a great extent similar to the states occurring during overt actions, then it is not unreasonable to posit that real, overt actions are also the result of an internal ‘simulation’ process. This is the basic idea behind the PMP. From this point of view, the simulation of PMP networks is a way to generate what Jeannerod calls ‘S-states.’

A closely related issue is that of understanding actions of conspecifics and learning by imitation. One of the explanations proposed is that observation of others’ actions would activate, in the observer’s brain, the same mechanisms that would be activated were that action imagined by the observer himself or herself (Gallese and Goldman, 1998). Imitation, as suggested by Iacoboni, could be based on directly matching the observed action onto an internal simulation of that action (Iacoboni, 2009). Internal simulations hence play an important role in allowing the observer to foresee the consequence of an action, predict the intended goal of the actor, and learn to replicate the action through imitation. In a recent study, Mohan et al. (2011) demonstrated how humanoid robots can learn a range of motor skills by observing a teacher, with the PMP framework as a central building block. They further demonstrated how abstract motor knowledge acquired by the humanoid robot iCub while learning one skill (drawing) can be reused while learning another, quite different skill (controlling a 2 DoF toy-crane), hence drastically speeding up learning. These results further support the idea of PMP being a unified computational framework for execution, reasoning, understanding, and imitation of action, thus suggesting a strong link between PMP, EPH, simulation theory of covert actions, and mirror neuron systems. In sum, PMP networks can be activated under a variety of conditions in relation to action, either of oneself or observed from other individuals. Their function is not only to shape the motor output during action execution, but also to provide the self with information on the feasibility, consequence, understanding, and meaning of potential actions. What remains outside the scope of this article is the set of processes that allow ‘action schemas,’ generated by a PMP mechanism, to interact with neuromuscular and external world dynamics.

Figure 3 Bimanual coordination task (reaching two objects at the same time) for the iCub robot implemented by means of the passive motion paradigm. Panel (a): kinematic network, with two target goals and a single time-base generator. The network includes three modules: (1) right arm, (2) left arm, and (3) waist. The dimensionality of JR and JL is 3 × 10 (this includes the seven DoFs of each arm and the three DoFs of the waist). The dimensionality of Aj is 7 × 7 and of AT is 3 × 3. The three subnetworks interact through a pair of nodes (‘assignment’ and ‘sum’) that allow the spread of the goal-related activation patterns. This simplified network is obtained from the global network of Figure 2 by ‘grounding’ the waist, that is, by inhibiting the spread of activation to the lower limbs. Panels (b and c) show the initial and the final postures of the robot and the two cylindrical target objects. Panels (d and e) show the trajectories of the two end-effectors and the corresponding speed profiles (together with the output G(t) of the time-base generator). Panel (f) clarifies the intrinsic degrees of freedom in the left-arm-torso chain. Panel (g) shows the time course of the waist and left-arm joint rotation patterns: J0–J2: joint angles of the waist (yaw, roll, pitch); J3–J9: joint angles of the right arm (shoulder pitch/yaw/roll; elbow flexion/extension; wrist pronation/supination/pitch/yaw).

Learning Paradigms in Neural Networks and Motor Control

At the core of the theories of neural network models is the attempt to capture general approaches for learning, from experience, tasks that are too complex to be expressed by means of explicit or symbolic models (Arbib, 1995). The mechanism of learning and memory has been an intriguing question since the establishment of the neuron theory at the turn of the nineteenth century (Ramón y Cajal, 1928) and the ensuing conjecture that memories are encoded at synaptic sites (Hebb, 1949) as a consequence of a process of learning. In accordance with this prediction, synaptic plasticity was first discovered in the hippocampus, and nowadays it is generally thought that LTP (long-term potentiation) is the basis of cognitive learning and memory, although the specific mechanisms are still a matter of investigation.
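
The conjecture that learning is stored as changes in synaptic strength can be illustrated with a minimal sketch of a correlation-based weight update. It is not a model of LTP: it uses Oja's normalized variant of the Hebbian rule for a single linear unit, chosen here because the plain Hebbian update grows without bound. The stimuli, learning rate, number of epochs, and initial weights are hypothetical toy values.

```python
import math

# Hypothetical toy stimuli: two input channels that are strongly correlated.
SAMPLES = [(1.0, 1.0), (-1.0, -1.0), (1.0, 0.8),
           (0.8, 1.0), (-1.0, -0.8), (-0.8, -1.0)]

def train_oja(samples, epochs=2000, rate=0.01, w0=(0.3, 0.0)):
    """Train one linear unit with Oja's rule (Hebbian growth plus decay)."""
    w = list(w0)
    for _ in range(epochs):
        for x in samples:
            y = w[0] * x[0] + w[1] * x[1]   # postsynaptic activity
            for i in range(2):
                # Hebbian term rate*y*x[i], minus a decay term that bounds |w|.
                w[i] += rate * y * (x[i] - y * w[i])
    return w

weights = train_oja(SAMPLES)
norm = math.hypot(weights[0], weights[1])
```

With these inputs the weight vector settles close to unit length along the direction in which the two channels co-vary, so the unit ends up encoding the dominant correlation in its stimuli without any teacher or error signal.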

Three main paradigms for training the parameters, or synaptic weights, of neural network models have been identified: (1) ‘Supervised learning,’ in which a teacher or supervisor provides a detailed description of the desired response for any given stimulus and exploits the mismatch between the computed and the desired response (the error signal) to modify the synaptic weights according to an iterative procedure. The mathematical technique typically used in this type of learning is known as backpropagation and is based on a gradient-descent mechanism that attempts to minimize the average output error (Rumelhart et al., 1986); (2) ‘Reinforcement learning,’ which also assumes the presence of a supervisor or teacher, whose intervention is, however, limited to rewarding (or punishing) the degree of success of a given control pattern, without any detailed input–output instruction (Sutton and Barto, 1998). The underlying mathematical formulation aims at maximizing the reward accumulated during the learning period; (3) ‘Unsupervised learning,’ in which there is no teacher or explicit instruction and the network is only supposed to capture the statistical structure of the input stimuli in order to build a consistent but concise internal representation of the input. The typical learning strategy is called Hebbian, in recognition of the pioneering work of D.O. Hebb, and is based on a competitive or self-organizing mechanism that uses the local correlation in the activity of adjacent neurons and aims at maximizing the mutual information between stimuli and internal patterns.

How to link the learning paradigms above, which have been derived to explain the plasticity of specific neural networks, to learning paradigms that apply to complex ‘naturalistic’ behaviors, and thus involve a number of networks as well as the dynamics of the outside world, is still an open question that will require significant experimental and theoretical advances in the coming years.

See also: Cerebral Cortex; Classical Mechanics and Motor Control; Motor Cortex; Self-Organizing Dynamical Systems.

Bibliography

Anokhin, P.K., 1974. Biology and Neurophysiology of Conditioned Reflexes and Their Role in Adaptive Behaviour. Pergamon Press, Oxford, UK.
Arbib, M.A., 1995. The Handbook of Brain Theory and Neural Networks. MIT Press, Cambridge, MA.
Asatryan, D.G., Feldman, A.G., 1965. Functional tuning of the nervous system with control of movements or maintenance of a steady posture. Biophysics 10, 925–935.
Bernstein, N.A., 1967. The Coordination and Regulation of Movement. Pergamon Press, Oxford, UK.
Berthoz, A., 1997. Le sens du mouvement. Édition Odile Jacob, Paris.
Bizzi, E., Polit, A., Morasso, P., 1976. Mechanisms underlying recovery of final head position. Journal of Neurophysiology 39, 435–444.
Bizzi, E., Hogan, N., Mussa Ivaldi, F.A., Giszter, S.F., 1992. Does the nervous system use equilibrium-point control to guide single and multiple movements? Behavioral and Brain Sciences 15, 603–613.
Damasio, A.R., 1994. Descartes’ Error. Emotion, Reason and the Human Brain. Putnam Press, New York.
Feldman, A.G., 1966. Functional tuning of the nervous system with control of movement or maintenance of a steady posture, II: controllable parameters of the muscles. Biophysics 11, 565–578.
Feldman, A.G., Levin, M.F., 1995. The origin and use of positional frames of references in motor control. Behavioral and Brain Sciences 18, 723–745.
Flash, T., Hogan, N., 1985. The coordination of arm movements: an experimentally confirmed mathematical model. Journal of Neuroscience 7, 1688–1703.
Friston, K., 2010. Free energy principle: a unified brain theory? Nature Neuroscience 11, 127–138.
Gallese, V., Goldman, A., 1998. Mirror neurons and the simulation theory of mind reading. Trends in Cognitive Sciences 2, 493–501.
Harris, C.M., Wolpert, D.M., 1998. Signal-dependent noise determines motor planning. Nature 394, 780–784.
Hebb, D.O., 1949. The Organization of Behavior. Wiley, New York.
Iacoboni, M., 2009. Neurobiology of imitation. Current Opinion In Neurobiology 19, 661–665.
Jeannerod, M., 2001. Neural simulation of action: a unifying mechanism for motor cognition. Neuroimage 14, 103–109.
Kaminski, T.R., 2007. The coupling between upper and lower extremity synergies during whole body reaching. Gait Posture 26, 256–262.
Kranczioch, C., Mathews, S., Dean, J.A., Sterr, A., 2009. On the equivalence of executed and imagined movements. Human Brain Mapping 30, 3275–3286.
Marey, E.J., 1894. Le mouvement. Édition Masson, Paris.
Mohan, V., Morasso, P., 2011. Passive motion paradigm: an alternative to optimal control. Frontiers in Neurorobotics 5 (Art. 4), 1–28.
Mohan, V., Morasso, P., Zenzeri, J., Metta, G., Srinivasa Chakravarthy, V., Sandini, G., 2011. Teaching a humanoid robot to draw ‘Shapes’. Autonomous Robots 31, 21–53.
Morasso, P., 1981. Spatial control of arm movements. Experimental Brain Research 42, 223–227.
Morasso, P., Casadio, M., Mohan, V., Zenzeri, J., 2010. A neural mechanism of synergy formation for whole body reaching. Biological Cybernetics 102, 45–55.
Munzert, J., Lorey, B., Zentgraf, K., 2009. Cognitive motor processes: the role of motor imagery in the study of motor representations. Brain Research Reviews 60, 306–326.
Mussa-Ivaldi, F.A., Morasso, P., Zaccaria, R., 1988. Kinematic networks. A distributed model for representing and regularizing motor redundancy. Biological Cybernetics 60, 1–16.
Muybridge, E., 1957. The Human Figure in Motion. Dover Press, New York.
Piaget, J., 1963. The Origin of Intelligence in Children. Norton Press, New York.
Pozzo, T., Stapley, P.J., Papaxanthis, C., 2002. Coordination between equilibrium and hand trajectories during whole body pointing movements. Experimental Brain Research 144, 343–350.
Ramón y Cajal, S., 1928. Regeneration in the Vertebrate Central Nervous System. Oxford University Press, Oxford, UK.
Rizzolatti, G., Sinigaglia, C., 2010. The functional role of the parieto-frontal mirror circuit: interpretations and misinterpretations. Nature Reviews Neuroscience 11, 264–274.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-propagating errors. Nature 323 (6088), 533–536.
Shadmehr, R., Mussa-Ivaldi, F.A., 1994. Adaptive representation of dynamics during learning of a motor task. Journal of Neuroscience 14, 3208–3224.
Stapley, P.J., Cheron, G., Grishin, A., 1999. Does the coordination between posture and movement during human whole-body-reaching ensure center of mass stabilization? Experimental Brain Research 129, 134–146.
Sutton, R.S., Barto, A.G., 1998. Reinforcement Learning. MIT Press, Cambridge, MA.
Todorov, E., Jordan, M.I., 2002. Optimal feedback control as a theory of motor coordination. Nature Neuroscience 5, 1226–1235.
Uno, Y., Kawato, M., Suzuki, R., 1989. Formation and control of optimal trajectory in human multijoint arm movement. Minimum torque-change model. Biological Cybernetics 61, 89–101.
von Holst, E., Mittelstaedt, H., 1950. Das Reafferenzprinzip. Wechselwirkungen zwischen Zentralnervensystem und Peripherie. Naturwissenschaften 37, 464–476.
Wiener, N., 1948. Cybernetics or Control and Communication in the Animal and the Machine. Hermann & Cie, Paris; MIT Press, Cambridge, MA.
Wolpert, D.M., Kawato, M., 1998. Internal models of the cerebellum. Trends in Cognitive Science 2, 338–347.
Zak, M., 1988. Terminal attractors for addressable memory in neural networks. Physics Letters A 133, 218–222.
