Experimental Demonstration of Coupled Learning in Elastic Networks
symmetrizing via iterative tuning of individual spring rest lengths. These mechanical networks
could feasibly be scaled and automated to solve increasingly complex tasks, hinting at a new class
of “smart” metamaterials.
[27]. Arinze et al. used autonomous softening of hinges to train thin elastic sheets to fold in desired ways in response to force patterns [28] based on a local supervised learning rule that softens elements only when they fold correctly [29].

In this paper, we tune each edge's rest length, rather than its stiffness. Using rest length as the learning degree of freedom requires special consideration, as each iterative learning step will alter the equilibrium state of the network. Additionally, unlike bond stiffnesses, which are directly analogous to conductances in flow networks, the rest length has no direct analogy and is unique to mechanical systems. In our approach, a turnbuckle is connected to a Hookean spring at its end, and by adjusting the length of the turnbuckle, the effective rest length of the combination spring-turnbuckle edge is altered. Since updates are not automated but rather made manually, it is imperative to choose architectures that minimize the number of tunable edges while still exhibiting a diversity of tunable behaviors. Here we provide two laboratory demonstrations of experimental mechanical systems that can train using coupled learning with adjustable rest-length components, and we compare with simulation. Our systems can optimize for a generic task without the use of a computer, placing them among the first supervised local learning demonstrations in physical mechanical systems. The two minimal architectures presented in this study have been shown to span the range of possible states and achieve the desired behavior quickly and smoothly. Further, our experimental demonstrations exhibit successful learning behaviors that are not accounted for by simulation, thus lending credence to the robustness of our algorithm. The success of these initial experiments shows promise for scalable learning networks in more complex systems.

II. COUPLED LEARNING FOR SPRING-TURNBUCKLE SYSTEMS

We construct a mechanical network such as the one depicted in Fig. 1(a). Each ith edge of this network is an extension spring with the same stiffness k and rest length l connected to a rigid turnbuckle with adjustable length L_i, as drawn schematically in the inset of Fig. 1(a). The total node-node separation of the edge, including the length of the rigid part, is s_i, such that the spring's extension past its rest length is s_i − (l + L_i). The mechanical energy associated with this edge is given by Hooke's law as

    u_i = (1/2) k [s_i − (l + L_i)]².    (1)

The total mechanical energy in the network is the sum of the elastic energies of all edges,

    U = Σ_j u_j,    (2)

and serves as the "physical cost function" which the system automatically minimizes subject to the imposed boundary conditions.

We wish to train this network to achieve some desired behavior in the positions of its nodes. This is what is referred to as a "motion task." We select certain nodes to be "inputs" and "outputs" for our task, depicted in Fig. 1(a) as blue and red filled circles, respectively. When the input node is at position x_I, we want the output node's position x_O to be at some desired value, x_D. We train for this behavior by iteratively adjusting the turnbuckle lengths at each edge according to the coupled learning algorithm [16]. Here, the positions of each node in the system serve as the "physical degrees of freedom," which equilibrate so that there is no net force on each node, while the adjustable rest lengths act as the "learning degrees of freedom" to be adjusted during training. In the free state, Fig. 1(a), we apply the input boundary condition to the network by fixing the position of the input node, x_I, and allowing all other physical degrees of freedom to equilibrate. The output node's position x_O is measured, along with the extensions of each edge in the network, s_i^F. In the clamped state, Fig. 1(b), we once again apply the input boundary condition, but now we also enforce a fixed value for the position of the output node by "nudging" it towards the desired position x_D. The clamped output value x_C generally takes the form

    x_C = x_F + η(x_D − x_F),    (3)

where the "nudge factor" 0 < η ≤ 1 is a hyperparameter of the training scheme. In all experiments below, we take a full nudge of η = 1 so the output node is clamped at the desired position at each learning step. The lengths s_i^C of each edge in this state are also recorded.

In coupled learning, the system evolves by comparing the mechanical energy of the network in the free and clamped states. Learning is achieved when the energy in the clamped state U^C is equal to the energy in the free state U^F. In the absence of nonlinear mechanical effects like buckling, the difference between these energies, which we refer to as the "learning contrast function," is always non-negative because the clamped state is more strongly constrained than the free state, U^C − U^F ≥ 0. Analogous to the machine-learning approach of minimizing a loss function like mean-squared error (MSE), the rest lengths evolve by descending along the gradient of the contrast function:

    dL_i/dt ∝ −(∂/∂L_i) Σ_j (u_j^C − u_j^F)    (4)
            = −(∂/∂L_i) (u_i^C − u_i^F)    (5)
            = k(s_i^C − s_i^F).    (6)

Since the learning contrast function is not squared, the partial derivative in Eq. (4) picks out only the j = i term, which is readily simplified to Eq. (6) using Eq. (1). Thus, we arrive at the following general discrete learning rule
FIG. 1. Schematic detailing the coupled learning algorithm for an arbitrary elastic network. A mechanical network is constructed such that each ith edge consists of a spring with stiffness k connected to a turnbuckle with adjustable rest length L_i. The springs act as the edges of the network, and their connection points are nodes. Specific nodes are chosen as "inputs" and "outputs" for the network to learn a desired task. Some nodes (gray) are fixed to prevent translation and rotation of the entire network, and the remaining nodes move in response to imposed boundary conditions. (a) In the free state, the position of the input node x_I is enforced, and the position of the output node x_O is measured. Each ith edge has length (i.e., node-node separation) s_i^F. (b) In the clamped state, the input node's position is still fixed at x_I and the output node is "clamped" to position x_C. Each ith edge has length s_i^C. (c) By locally comparing the lengths of each edge, the update rule in Eq. (7) determines how L_i evolves.
for how to update each edge at each training step for any network of identical spring-turnbuckle edges:

    ΔL_i = α(s_i^C − s_i^F),    (7)

where α is a per-step learning rate that we shall set in experiment, and (s_i^C − s_i^F) is the difference in clamped and free lengths of the edge being updated. This simple rule is purely local: each edge updates according only to its own behavior, irrespective of how other edges change upon clamping. As discussed below, we train with different learning-rate parameters in the range 0.1 < α ≤ 1. Iterative updates should drive the global learning contrast function to zero in order to achieve the desired motion function.

While our Eq. (7) update rule is spatially local, it is not temporally local, because the learning rule requires simultaneous information about the system in two states. The experimental implementation in an electrical resistor network [22] was able to circumvent this issue by building identical twin networks to run the free and clamped states simultaneously. By contrast, the mechanical network is embedded in space, posing difficulties for constructing twin two-dimensional networks side by side, and an impossibility entirely for three-dimensional implementations. Therefore, our approach must rely on temporal memory of the spring extensions between the free and clamped states.

Analogous to the voltage divider in the electrical network case, our first demonstration is a "motion divider" network, or two tunable-rest-length springs connected in series. A photograph of the network is shown in Fig. 2(a). We construct the network so it is hanging vertically under gravity from a fixed point, thereby restricting the motion to be along the axis of gravity. We define the one-dimensional position y = 0 to be at the fixed hanging point, and y > 0 measures position downwards from this origin. The first spring is connected to the y = 0 fixed point via a turnbuckle of length L_1, terminating at position y_1. The second turnbuckle of length L_2 then connects to the second spring, which terminates at position y_2. We choose y_2 to be the input node of our system and y_1 to be the output node. Both springs have equal stiffnesses k and natural rest lengths l, so that the Eq. (7) learning rule applies. The forces acting on springs in series are equal, so k(y_1 − l − L_1) = k(y_2 − y_1 − l − L_2) holds and can be solved for the output node position as

    y_1 = (1/2) y_2 + a,    (8)

where a = (1/2)(L_1 − L_2). Choosing some desired numerical value for a defines a trainable task for our system. This is equivalent to a machine-learning linear regression problem with one variable coefficient [30]. Note that there is no unique solution for L_1 and L_2; only their difference is trained for. Like typical machine-learning algorithms, our system is over-parameterized and thus has multiple solutions.

Each unit cell of our network consists of an extension spring coupled to a turnbuckle in series, as shown dia-
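As a concrete illustration, the Eq. (7) training loop for the motion divider can be sketched numerically. This is a minimal sketch, not the laboratory procedure: the parameter values and function names are illustrative assumptions, and the force-balance relation Eq. (8) stands in for the physical equilibration that the real network performs automatically.

```python
# Sketch of coupled learning for the two-spring "motion divider".
# Illustrative values only; a full nudge (eta = 1) is assumed, as in the paper.

def free_state(y2, L1, L2):
    """Force balance for two identical springs in series, Eq. (8):
    y1 = y2/2 + (L1 - L2)/2."""
    return 0.5 * y2 + 0.5 * (L1 - L2)

def train_motion_divider(y2, a, L1, L2, alpha=0.5, steps=25):
    """Iterate the discrete local update Delta L_i = alpha * (s_i^C - s_i^F)."""
    y1_desired = 0.5 * y2 + a                # target output position
    for _ in range(steps):
        y1_free = free_state(y2, L1, L2)     # measure the free state
        y1_clamped = y1_desired              # clamp the output (eta = 1)
        # Edge lengths: s1 = y1 (upper edge), s2 = y2 - y1 (lower edge)
        L1 += alpha * (y1_clamped - y1_free)                 # s1^C - s1^F
        L2 += alpha * ((y2 - y1_clamped) - (y2 - y1_free))   # s2^C - s2^F
    return L1, L2

L1, L2 = train_motion_divider(y2=30.0, a=2.0, L1=5.0, L2=5.0)
print((L1 - L2) / 2)  # approaches the chosen a = 2.0
```

The updates are equal and opposite, so only the trained combination (L_1 − L_2)/2 converges; the sum L_1 + L_2 retains its initial value, reflecting the over-parameterization noted above.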
IV. SYMMETRY NETWORK

We further demonstrate our system's ability to learn more complex tasks by constructing a two-dimensional network within a frame. The network is shown schematically and photographically in Fig. 4(a-b). Four of the edges are fixed to a rigid 45 cm × 40 cm frame constructed from 80/20, with the fifth central edge connecting the two internal nodes of the system. Edges are typically stretched past their rest-length value by about 6 cm. Edges are labeled by number, and nodes are denoted as source (blue) and target (red).

The goal is to train this network to become left-right symmetric, which occurs when L_1 = L_2 and L_4 = L_5. This task may be encoded in the physical degrees of freedom (node positions) of the network with the condition:

    x_target = x_source = 0 for any y_source value,

where x = 0 is defined to be the midline between the fixed nodes on either side. We may then train this network for this one-input, two-output task using a single data point.

A. Experiment

The symmetry network was constructed using the same unit-cell design as in the motion divider in Section III. The same turnbuckles were used in construction of the edges, and keyrings served as the internal nodes of the system. The springs used here have a natural rest length of l = 5.5 cm and a stiffness of 27.8 N/cm (Grainger 1NAA2). The external frame of the network was constructed from 80/20 parts, and eyeholes screwed into the rail serve as the fixed nodes.

Training this network requires only one input-output pair. Both outputs, x_source and x_target, have desired values that are directly central in the frame. Since training can occur for any choice of the input value, y_source, we choose the equilibrium position of this node in its initial state for simplicity.

It is important to note that the physical degrees of freedom of a single node are mixed: the "source" node acts as an input in the y-direction and as an output in the x-direction. The coupled learning algorithm allows for this seemingly strange coupling, and the network is still trainable under this scheme. The mixing of degrees of freedom was achieved experimentally by aligning the nodes along 80/20 tracks, as can be seen in the photos in Fig. 4(a-b). In the free state, we fix the input value for the training task, y_source, by attaching the node to a slide-in nylon tool hanger which slots into a horizontal 80/20 track so its vertical position is fixed. The node may still move along the horizontal direction with a small coefficient of static friction, µ ≈ 0.1 for nylon on dry aluminum. The target node is left unconstrained so both its degrees of freedom can freely equilibrate. In the clamped state, we fix both degrees of freedom of the bottom node, {y_source, x_source}, by placing stoppers along the horizontal track described previously. The horizontal position of the top node, x_target, is restricted by placing it on a vertical track using the same tool hanger as above, while its vertical position y_target is allowed to equilibrate freely.

Since this network requires measurements in two dimensions, we chose to automate the measurement process using a digital camera. As can be seen in the photographs in Fig. 4(a-b), we attach pink markers to the free and fixed nodes (6), as well as the points of connection between the turnbuckles and springs (5). The camera captures a three-channel color image of the entire network, where we choose a beneficial white balance (2500 K) and tint (M6.0) to maximize the contrast between the markers and the remainder of the image. We then split the color channels and subtract the green channel from the red, leaving us with a one-channel image with high intensity values at the pixels associated with markers and low intensity everywhere else. The image is then binarized using a threshold intensity value of 100, which is in between the well-separated high and low intensity values. The pixel values that have been identified after binarization are then clustered into 11 clusters using a k-means algorithm [31]. The centroid of each cluster is determined to be the (x, y) position of the associated marker. This allows us to track all node positions, spring extensions, and turnbuckle lengths at every stage of the experiment. For a given photograph with 11 markers such as the ones in Fig. 4, the full processing and tracking takes approximately 0.7 s on a standard MacBook Pro.

The results of training the physical network are shown in Fig. 4(c-d). The learning degrees of freedom evolve such that L_1 comes to meet L_2 and L_4 meets L_5 within measurement error of 0.28 cm in only 5 training steps. The time evolution of these rest lengths follows a similar trend as in the motion divider, where the largest adjustments happen in the early training steps and updates become asymptotically small as the network approaches the learned state. See Appendix A for more information about learning dynamics. The success of training may also be measured by the cost function, C = U^C − U^F. Fig. 4(d) shows that the cost decreases exponentially from its initial value by about three orders of magnitude over the course of training.

The middle edge L_3 does not exhibit exponential time-evolution behavior, but rather drifts upward over the course of training by a total of 1.5 cm, or 37.5% of the range of values. This linear evolution suggests some systematic experimental bias within the system. Since L_3 has no bearing on the desired state, the network can learn the task for any value of L_3, even if we had held its value fixed over all training steps.

B. Simulation

There are many different ways to train for symmetry in our two-dimensional network. Our goal for this task is
FIG. 4. Training a two-dimensional network for symmetry. (a-b) Schematics and photographs of the free and clamped states of the network for a given initial configuration. Edges are labeled by number (a, top). Photographs of the experimental apparatus show the marker-tracking system and clamping mechanism. (c-d) Experimental training results. Rest lengths (c) converge to the symmetric state within experimental precision. The cost C = U^C − U^F (d) decreases by three orders of magnitude. (e-f) Simulation training results. The rest lengths (e) evolve such that L_1 = L_2 and L_4 = L_5. Cost (f) decreases exponentially by five orders of magnitude.
a specified internal state of the network, but the coupled learning scheme requires that we specify a behavior in the nodes that is satisfied by this internal state, of which there are multiple options. We therefore use simulation support to explore the different iterations of our task and examine their behavioral dynamics. We may also use simulation to modify the aspect ratio of the frame, as the angles of the edges at the internal nodes contribute to the coupling strength of the learning signal. Having done this, we chose the training task and geometry that allowed for the most efficient evolution to the learned state for our experimental demonstration.

We simulated training on this network using the FIRE optimization algorithm to determine the network's node positions when boundary conditions are applied [32]. Here, each edge was allowed to vary between 0.5 and 1.0 length units, and all edge stiffnesses were fixed at the same value, 1. Initial edge lengths were selected at random with validation that the network was sufficiently detuned from its goal state.

The behavior that was used to train the network is a one-input / two-output task, but we could have alternatively defined a one-input, one-output task that is equally satisfied by the symmetric state of the network. Training on the one-input, one-output task was successful in simulation, but required very long training times that were unfeasible for an experiment with manual operation. In this case, L_4 and L_5 evolve quickly to their desired values due to their strong coupling to the target node. L_1 and L_2, however, are only connected directly to the source node, which does not move in position very much between the free and clamped states, and they therefore evolve very slowly. For more information, see Supplemental Information, Sec. B. The two-output task allows for a strong learning signal for all of the edges in the network.

Training was performed using a full nudge η = 1 and Eq. (7) with a learning rate of α = 0.4. Successful training is possible with a higher learning rate, but we have artificially slowed the training to better examine the learning dynamics. Fig. 4(e) shows the evolution of the edges' rest lengths during the course of training. The outer edges evolve smoothly to meet their desired state, L_1 = L_2 and L_4 = L_5, in approximately 10 training steps.

The middle edge's rest length L_3 has no bearing on the desired goal state, yet it still evolves during training, particularly in the first few training steps. The desired state is not dependent on L_3, but the cost function is, as this is the edge which directly connects the two relevant nodes. While evolution of L_3 is not required for the network to learn, allowing it to vary during training according to the coupled learning update rule helps the network descend the cost landscape more quickly. In the experimental trial, Fig. 4(c), L_3 systematically drifts upward, rather than the asymptotic decrease that is seen in Fig. 4(e). We attribute this apparent discrepancy to frictional effects in the apparatus. While this likely inhibited the speed of training, the network was still able to converge on the desired configuration of rest lengths. This lends credence to the coupled learning algorithm.
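The simulation procedure can be sketched compactly. This is a minimal sketch under stated assumptions: the frame geometry, node names, and initial rest lengths are invented for illustration; a plain gradient-descent relaxation stands in for the FIRE optimizer; and each "rest length" here is the total rest length l + L_i of an edge.

```python
import math

# Illustrative symmetry-network geometry: four fixed frame nodes, two internal
# nodes ("t" = target above "s" = source), five spring-turnbuckle edges L1..L5.
FIXED = {"tl": (-2.0, 3.0), "tr": (2.0, 3.0), "bl": (-2.0, 0.0), "br": (2.0, 0.0)}
EDGES = [("tl", "t"), ("tr", "t"), ("t", "s"), ("bl", "s"), ("br", "s")]

def equilibrate(L, clamped, y_source=1.0, iters=3000, step=0.1):
    """Relax the free node coordinates to force balance (unit stiffness k = 1)
    by gradient descent, then return the edge lengths s_i.
    Free state: only the input y_source is fixed.
    Clamped state: additionally x_target = x_source = 0 (the symmetry goal)."""
    pos = {"t": [0.4, 2.0], "s": [0.4, y_source]}
    if clamped:
        pos["t"][0] = pos["s"][0] = 0.0
    mobile = [("t", 1)] if clamped else [("t", 0), ("t", 1), ("s", 0)]

    def coords(n):
        return FIXED.get(n, pos.get(n))

    for _ in range(iters):
        grad = {"t": [0.0, 0.0], "s": [0.0, 0.0]}
        for (a, b), rest in zip(EDGES, L):
            (xa, ya), (xb, yb) = coords(a), coords(b)
            dx, dy = xb - xa, yb - ya
            s = math.hypot(dx, dy)
            f = (s - rest) / s  # spring tension per unit length
            if a in grad:
                grad[a][0] -= f * dx
                grad[a][1] -= f * dy
            if b in grad:
                grad[b][0] += f * dx
                grad[b][1] += f * dy
        for n, i in mobile:
            pos[n][i] -= step * grad[n][i]
    return [math.dist(coords(a), coords(b)) for a, b in EDGES]

L = [1.4, 1.8, 0.5, 1.3, 1.7]  # detuned initial rest lengths (all stretched)
for _ in range(40):
    sF = equilibrate(L, clamped=False)
    sC = equilibrate(L, clamped=True)
    L = [Li + 0.4 * (c - f) for Li, c, f in zip(L, sC, sF)]  # Eq. (7), alpha = 0.4
print(abs(L[0] - L[1]), abs(L[3] - L[4]))  # both pair differences shrink toward 0
```

Because the clamped state pins both internal nodes to the midline, each left-right edge pair has identical clamped lengths by symmetry, so the Eq. (7) update drives each pair difference toward zero, mirroring the simulated behavior described above.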
[1] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature 521, 436 (2015).
[2] M. Stern and A. Murugan, Learning without neurons in physical systems, Annual Review of Condensed Matter Physics 14, 417 (2023).
[3] S. Denève, A. Alemi, and R. Bourdoukan, The Brain as an Efficient and Robust Adaptive Learner, Neuron 94, 969 (2017).
[4] B. Sengupta and M. B. Stemmler, Power Consumption During Neuronal Computation, Proceedings of the IEEE 102, 738 (2014).
[5] R. A. John, N. Tiwari, M. I. B. Patdillah, M. R. Kulkarni, N. Tiwari, J. Basu, S. K. Bose, Ankit, C. J. Yu, A. Nirmal, et al., Self healable neuromorphic memtransistor elements for decentralized sensory signal processing in robotics, Nature Communications 11, 4030 (2020).
[6] R. A. McGovern, A. N. V. Moosa, L. Jehi, R. Busch, L. Ferguson, A. Gupta, J. Gonzalez-Martinez, E. Wyllie, I. Najm, and W. E. Bingaman, Hemispherectomy in adults and adolescents: Seizure and functional outcomes in 47 patients, Epilepsia 60, 2416 (2019).
[7] L. G. Wright, T. Onodera, M. M. Stein, T. Wang, D. T. Schachter, Z. Hu, and P. L. McMahon, Deep physical neural networks trained with backpropagation, Nature 601, 549 (2022).
[8] R. Rojas, The backpropagation algorithm, in Neural Networks: A Systematic Introduction, 149 (1996).
[9] D. Hexner, N. Pashine, A. J. Liu, and S. R. Nagel, Effect of directed aging on nonlinear elasticity and memory formation in a material, Physical Review Research 2, 043231 (2020).
[10] J. R. Movellan, Contrastive Hebbian Learning in the Continuous Hopfield Model, in Connectionist Models, edited by D. S. Touretzky, J. L. Elman, T. J. Sejnowski, and G. E. Hinton (Morgan Kaufmann, 1991) pp. 10–17.
[11] M. J. Falk, J. Wu, A. Matthews, V. Sachdeva, N. Pashine, M. L. Gardel, S. R. Nagel, and A. Murugan, Learning to learn by using nonequilibrium training protocols for adaptable materials, Proceedings of the National Academy of Sciences 120, e2219558120 (2023).
[12] N. Pashine, Local rules for fabricating allosteric networks, Physical Review Materials 5, 065607 (2021).
[13] J. Kendall, R. Pantone, K. Manickavasagam, Y. Bengio, and B. Scellier, Training End-to-End Analog Neural Networks with Equilibrium Propagation, arXiv:2006.01981 [cs] (2020).
[14] E. Martin, M. Ernoult, J. Laydevant, S. Li, D. Querlioz, T. Petrisor, and J. Grollier, EqSpike: Spike-driven equilibrium propagation for neuromorphic implementations, iScience 24, 102222 (2021).
[15] C. Li, D. Belkin, Y. Li, P. Yan, M. Hu, N. Ge, H. Jiang, E. Montgomery, P. Lin, Z. Wang, W. Song, J. P. Strachan, M. Barnell, Q. Wu, R. S. Williams, J. J. Yang, and Q. Xia, Efficient and self-adaptive in-situ learning in multilayer memristor neural networks, Nature Communications 9, 2385 (2018).
[16] M. Stern, D. Hexner, J. W. Rocks, and A. J. Liu, Supervised Learning in Physical Networks: From Machine Learning to Learning Machines, Physical Review X 11, 021045 (2021).
[17] D. R. Reid, N. Pashine, J. M. Wozniak, H. M. Jaeger, A. J. Liu, S. R. Nagel, and J. J. de Pablo, Auxetic metamaterials from disordered networks, Proceedings of the National Academy of Sciences 115, E1384 (2018).
[18] C. P. Goodrich, A. J. Liu, and S. R. Nagel, The Principle of Independent Bond-Level Response: Tuning by Pruning to Exploit Disorder for Global Behavior, Physical Review Letters 114, 225501 (2015).
[19] N. Pashine, D. Hexner, A. J. Liu, and S. R. Nagel, Directed aging, memory, and nature's greed, Science Advances 5, eaax4215 (2019).
[20] J. W. Rocks, N. Pashine, I. Bischofberger, C. P. Goodrich, A. J. Liu, and S. R. Nagel, Designing allostery-inspired response in mechanical networks, Proceedings of the National Academy of Sciences 114, 2520 (2017).
[21] J. Laydevant, D. Markovic, and J. Grollier, Training an Ising machine with equilibrium propagation, arXiv preprint arXiv:2305.18321 (2023).
[22] S. Dillavou, M. Stern, A. J. Liu, and D. J. Durian, Demonstration of decentralized physics-driven learning, Physical Review Applied 18, 014040 (2022).
[23] J. F. Wycoff, S. Dillavou, M. Stern, A. J. Liu, and D. J. Durian, Desynchronous learning in a physics-driven learning network, The Journal of Chemical Physics 156, 144903 (2022).
[24] M. Stern, S. Dillavou, M. Z. Miskin, D. J. Durian, and A. J. Liu, Physical learning beyond the quasistatic limit, Physical Review Research 4, L022037 (2022).
[25] R. H. Lee, E. A. Mulder, and J. B. Hopkins, Mechanical neural networks: Architected materials that learn behaviors, Science Robotics 7, eabq7278 (2022).
[26] V. P. Patil, I. Ho, and M. Prakash, Self-learning mechanical circuits, arXiv preprint arXiv:2304.08711 (2023).
[27] P. Dayan, M. Sahani, and G. Deback, Unsupervised learning, The MIT Encyclopedia of the Cognitive Sciences, 857 (1999).
[28] C. Arinze, M. Stern, S. R. Nagel, and A. Murugan, Learning to self-fold at a bifurcation, Physical Review E 107, 025001 (2023).
[29] M. Stern, C. Arinze, L. Perez, S. E. Palmer, and A. Murugan, Supervised learning through physical changes in a mechanical system, Proceedings of the National Academy of Sciences 117, 14843 (2020).
[30] T. M. Hope, Linear regression, in Machine Learning (Elsevier, 2020) pp. 67–81.
[31] S. Lloyd, Least squares quantization in PCM, IEEE Transactions on Information Theory 28, 129 (1982).
[32] P. Koskinen, E. Bitzek, F. Gähler, M. Moseler, and P. Gumbsch, FIRE: Fast inertial relaxation engine for optimization on all scales (2019).
[33] S. Misra, M. Mitchell, R. Chen, and C. Sung, Design and control of a tunable-stiffness coiled-actuator, IEEE International Conference on Robotics and Automation (ICRA) (2023).

Appendix A: Motion Divider Learning Dynamics

For the motion divider, we can explicitly write out and solve for the learning dynamics. In this task, we choose an input value, y_2, for the position of the end node. The system is stretched and y_2 is held fixed for both the free and clamped states during the entire course of training. The position of the middle node, y_1, serves as the output. It will change over the course of training, both between free and clamped states and between training steps as L_1 and L_2 evolve. In the free state, force balance gives the position of the middle node as

    y_1^F = (1/2) y_2 + (1/2)(L_1 − L_2).    (A1)

Before training, this differs from the desired output value given by

    y_1^D = (1/2) y_2 + a,    (A2)

for whatever value the user has chosen for a. During training, the goal is to adjust the turnbuckle lengths so that y_1^F → y_1^D, which happens when (L_1 − L_2)/2 → a. The update rule to achieve this is based on comparing the free state to the clamped state, where the output node is clamped by nudge factor 0 < η ≤ 1 toward the desired position:

    y_1^C = η y_1^D + (1 − η) y_1^F.    (A3)

The general discrete update rule, Eq. (7), may now be evaluated by computing the node-node separations of each edge in free and clamped states using the above expressions. This gives

    ΔL_1 = αη [a − (L_1 − L_2)/2],    (A4)
    ΔL_2 = −ΔL_1.    (A5)

Note that these updates are equal and opposite, and vanish when learning is achieved. Also note that the nudge factor η and update factor α equivalently affect the rate of learning. Thus, in experiment, we take η = 1 and vary α without loss of generality. It should be emphasized that in the lab we use only Eq. (7) to make the updates, based on measurement of node-node separations in free and clamped states. In the lab, physics "computes" y_1^F automatically, whereas here in the appendix we additionally bring force balance to bear in order to predict learning dynamics.

To obtain differential equations from the predicted discrete update rules, one might guess that the time derivatives of L_1 and L_2 should be proportional to the right-hand sides of Eqs. (A4)-(A5). This turns out to be true. To see this, recall that the elastic energy in the clamped state must be greater than that in the free state, and the difference serves as the coupled learning contrast

    C = U^C − U^F    (A6)
      = kη² [a − (L_1 − L_2)/2]²,    (A7)
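The discrete dynamics (A4)-(A5) imply that the mismatch Δ = a − (L_1 − L_2)/2 shrinks by the factor (1 − αη) at each step, so the contrast C of Eq. (A7) decays geometrically by (1 − αη)². A short numerical check of this prediction (the parameter values here are arbitrary):

```python
# Check of Eqs. (A4)-(A7): with Delta = a - (L1 - L2)/2, each step multiplies
# Delta by (1 - alpha*eta), so C decays by the fixed ratio (1 - alpha*eta)^2.
k, alpha, eta, a = 1.0, 0.3, 0.8, 2.0
L1, L2 = 5.0, 4.0
history = []
for _ in range(10):
    delta = a - (L1 - L2) / 2
    history.append(k * eta**2 * delta**2)  # contrast C, Eq. (A7)
    dL1 = alpha * eta * delta              # Eq. (A4)
    L1, L2 = L1 + dL1, L2 - dL1            # Eq. (A5)
ratios = [c1 / c0 for c0, c1 in zip(history, history[1:])]
print(ratios[0], (1 - alpha * eta) ** 2)   # every ratio equals (1 - alpha*eta)^2
```

The constant step-to-step ratio confirms the exponential decay of the contrast seen in both experiment and simulation.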