
Eng. Res. Express 4 (2022) 025019    https://doi.org/10.1088/2631-8695/ac6bde

PAPER • OPEN ACCESS

Bio-inspired navigation and exploration system for a hexapod robotic platform

Josh Pardo-Cabrera1, Jesús D Rivero-Ortega1, Julián Hurtado-López2 and David F Ramírez-Moreno3

1 Department of Engineering, Universidad Autónoma de Occidente, Colombia
2 Department of Mathematics, Universidad Autónoma de Occidente, Colombia
3 Department of Physics, Universidad Autónoma de Occidente, Colombia

E-mail: josh.pardo@uao.edu.co

Received 25 October 2021
Revised 24 March 2022
Accepted for publication 29 April 2022
Published 16 May 2022

Keywords: neural networks, path integration, central pattern generator, orientation correction, vector summation, bio-inspired robotics
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Abstract
This paper presents a biologically inspired system for guiding and controlling a virtual hexapod robot. Our navigation and exploration system is composed of subsystems that execute processes of path integration, action selection, actuator control and correction of the robot's orientation. For the subsystem that serves the path integration function we modified an existing model of bio-inspired vector summation by adding the capability of performing online calculation. For the action selection subsystem, which allows switching between the behaviors of exploration, approaching a target and homing, we modified an existing model of decision making for mediating social behaviors in mice. We added an additional circuit that projects a signal to the units representing each of the behaviors. In the case of the actuator control subsystem, the structure of a central pattern generator model that incorporates feedback and adaptation was used as the base for generating and transforming signals for the actuators. Finally, the orientation correction subsystem is a novel model that determines an error value from the desired and current orientations. The proposed models were simulated as independent scripts and then implemented as ROS (Robot Operating System) nodes for controlling a robot simulation in Gazebo.

1. Introduction

Evolution through natural selection has provided living forms with solutions that allow them to survive in
complex environments. These solutions in the form of bio-mechanical features include different ways of
actuation, varied architectures and materials [1]. Solutions also exist in the form of cognitive skills in more
complex animals, which include navigation, exploration, action selection, decision making, spatial
representation and perception [2]. These traits make animals well suited to perform tasks such as overcoming
rough terrain, hunting, foraging and homing. The ability to perform these kinds of tasks is of interest for modern robotics, in which robots are designed to operate in difficult conditions such as unknown working environments,
achieve complicated goals, and behave as autonomously as possible. Engineers have found great inspiration in
nature in terms of replicating certain animal functions [3] or developing diverse solutions to different problems.
Bio-robotics and neuro-robotics are two examples of disciplines that take advantage of models based on the
study of animal body architecture, perception, behavior, and cognition [4]. Yang et al listed the challenges that
will be faced by the field of robotics in the next ten years [5]. Some of those challenges are related to our work on bio-inspired robots, as well as to the improvement of navigation and exploration systems. In addition to taking inspiration from the capabilities of living beings, understanding how these capabilities develop is of interest in disciplines
such as health sciences, in which knowing how these mechanisms arise can lead to ways of treating different
diseases and disorders.
In this work, we propose a bio-inspired navigation and exploration system for a hexapod robotic platform, inspired by some aspects of animal behavior. One of the skills we took into consideration was the generation of




limb actuation signals in animals such as crustaceans [6]. These signals are produced by Central Pattern
Generators (CPGs), and allow the control of limbs from low-dimensional signals [7, 8] or in their absence [9].
Understanding the mechanisms that generate signals for controlling limb movement is of interest in designing robots intended to negotiate harsh terrain, as well as in rehabilitation processes for injured people [10].
The research on legged robots has been growing in applications. Traditional architectures of mobile robots
with wheels or tracks [11] can be inappropriate for certain working environments such as terrains with debris
and irregularities. Also, conventional mobile robots struggle in tasks with variable conditions or uncertain working
environments. During legged movement, the body balance is preserved and gait patterns continuously adapt to
environmental changes. There are two reasons for studying legged robotics: to develop vehicles that can move on
uneven terrain, and to understand the complexity of human and animal locomotion [12]. Also, understanding
animal locomotion is necessary to propose solutions for rehabilitation devices [13]. Locomotion mechanisms
that mimic legged creatures provide mobility and stability under challenging circumstances and uneven terrain.
However, each leg must have at least two degrees of freedom to pass over obstacles, and the more joints are used,
the greater the complexity due to the variables to control, such as weight, servo motors and power consumption, among others [14]. The study of legged robot mechanics has also introduced the analysis of extremities with
compliance [15, 16].
The system proposed in this work also relates to competitive selection, which comprises a set of operations
whose purpose can be summarized as selecting one option from a pool of possible choices [17]. These operations
can refer to different levels of complexity, from attention focus tasks to specific motor programs depending on
the environment status. The ability to make decisions, select salient stimuli, or move according to the context is of fundamental importance for animals' survival, as they are immersed in feeding, reproduction and
exploration [18]. Some models have been proposed for different kinds of competitive selection tasks. Héricé et al
proposed a biophysically plausible model of the Basal Ganglia to perform decision making and action selection
in a task with cognitive and motor demands [19]. The modulation of social behaviors in mice was modelled as a
continuous attractor network in [20]. As an example of nature-inspired robots that perform competitive
selection, Barron-Zambrano et al [21] designed a hexapod for obstacle avoidance and target pursuit. They used a controller based on a finite state machine and fuzzy logic, inspired by neurobiological systems, to integrate information from the environment and modulate the hexapod's locomotion patterns.
In some applications, mobile robots prevent humans from being injured in search and rescue tasks, field exploration, industrial inspection or terrain mapping. These tasks usually require performing competitive selection at various levels. The pipeline exploration robot described in [22] has a decision-making system to control the actions of the robot. Some applications of human-robot interaction (HRI) mentioned in [23] include the aforementioned tasks.
Another animal ability is to explore and navigate unknown and dynamic environments over long distances
and return to their starting point in a task known as homing [24]. Homing can be done through a path
integration process, in which the animal can integrate its path with [25] or without external cues [26], such as
visual or acoustic stimuli. Spatial cognition encompasses those tasks in a concept that represents how cognitive
representations of space are created and used for different affairs. In the nervous system, some structures and
cells are thought to be implicated in forming these spatial representations, such as Head Direction Cells (HD Cells), Place Cells, and Grid Cells. Recently discovered bidirectional neurons also seem to be implicated in the perception of directionality [27]. Path integration is studied not only in cognitive science and ethology but also in robotics [28], as understanding this phenomenon and the bases of this process contributes to new types of spatial representation and navigation algorithms for mobile robots [29]. RHex is a cockroach-inspired robot developed for planetary exploration; it can work for 15 min at maximum speed [30].

2. Methods

Our proposed navigation and exploration system combines the use of bio-inspired units and traditional
computation. The dynamics of the bio-inspired units are governed by differential equations that describe how the mean firing rate changes in time, following the Wilson-Cowan model [31]. This view accounts for a network level of organization [32]. Under this framework there are excitatory and inhibitory units connected in different ways; these connections are also called synapses in this context. An excitatory unit connected to another unit contributes to increasing the firing rate of the second one. Conversely, an inhibitory unit connected to another unit causes the firing rate of the second one to decrease.
Equation (1) shows the Wilson-Cowan model, in which E refers to a unit’s mean firing rate, τ is the time
constant, f is a nonlinear activation function that relates the input of a unit to its activity, Wi is the synaptic weight
of the connection between the presynaptic and the postsynaptic units, and Yi is the activity level of the
presynaptic unit.


Figure 1. Hexapod model visualized in Gazebo.

\tau \frac{dE}{dt} = -E + f\left( \sum_{i=1}^{n} W_i Y_i \right) \qquad (1)

We use the Naka-Rushton function as a non-linear activation function that maps a unit's input to its activity.
From a computational neuroscience view, this allows the model to include unit-specific attributes that might change the output. From a functional perspective, the inclusion of a non-linearity permits obtaining richer dynamics, which provide more complex behaviors in the agents controlled by our system. The Naka-Rushton equation is depicted in equation (2), where m is the maximum value that the function output can reach, and σ (semi-
saturation constant) is the stimulus value at which the output of the function reaches half of the maximum
possible value. On the other hand, u represents the net input and Nj is an exponent that modifies the slope of the
curve.

f_j(u) = \begin{cases} \dfrac{m\, u^{N_j}}{\sigma^{N_j} + u^{N_j}} & \text{if } u \geq 0, \\ 0 & \text{if } u < 0. \end{cases} \qquad (2)
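As an illustration of how equations (1) and (2) can be integrated numerically, the following Python sketch applies a forward Euler step to a single unit. It is not the authors' implementation; the integration step and the example parameter values are placeholders.

import numpy as np

def naka_rushton(u, m, sigma, N):
    """Naka-Rushton activation, equation (2): zero for negative input."""
    u = np.maximum(u, 0.0)
    return m * u**N / (sigma**N + u**N)

def euler_step(E, inputs, weights, tau, m, sigma, N, dt=0.001):
    """One forward Euler step of the Wilson-Cowan rate equation (1)."""
    net = np.dot(weights, inputs)                  # sum_i W_i * Y_i
    dE = (-E + naka_rushton(net, m, sigma, N)) / tau
    return E + dt * dE

# Example: one unit driven by a constant input (placeholder parameters).
E = 0.0
for _ in range(2000):
    E = euler_step(E, inputs=np.array([1.0]), weights=np.array([3.0]),
                   tau=0.1, m=100.0, sigma=50.0, N=2)
print(E)  # steady-state firing rate approached by the unit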

The proposed models were implemented in Python. The robotics simulations were made in Gazebo. The model scripts were adapted to be executed by ROS (Robot Operating System). We tested the fully integrated system on two computers with different ROS versions and their corresponding default Gazebo versions. The first computer ran Ubuntu 16.04 LTS and ROS Kinetic; it had an Intel Core i7-7700HQ, 16 GB of RAM and an NVIDIA GeForce GTX 1060 mobile video card. The second computer had an AMD Ryzen 9 5900HS, 16 GB of RAM and an NVIDIA GeForce RTX 3060 mobile video card; its operating system was Pop!_OS and the ROS version was Noetic.
The proposed hexapod model had a monocular camera and an inertial measurement unit, which were used in the last set of experiments (see the Experimental section). The robot model can be seen in figure 1. Its dimensions are a width of 45 cm, a length of 100 cm and a height of 18 cm.

3. Experimental

We tested different aspects of our system in different ways. First, we performed individual executions of the
proposed networks for orientation correction, path integration, action selection and locomotion on scripts
running in Python. Then, we compared the locomotion capabilities of the proposed hexapod architecture
against a wheeled robot, and made a comparison between our navigation method and the navigation stack
implemented in ROS using the turtlebot. Finally, we tested the integration of all the systems performing tasks of exploration, approaching and homing.


Figure 2. Test track, where the rightmost step is the first and the leftmost is the 10th. The separation between them is 1 m.

Table 1. Height of each step.

Step Height(m)

1 0.03
2 0.05
3 0.06
4 0.08
5 0.07
6 0.09
7 0.10
8 0.11
9 0.12
10 0.15

3.1. Scripts
For tuning and testing, each system was run separately as a Python script. For the action selection model, the testing task was to observe the response of the model to a specific set of stimuli. These stimuli were characterized as distance-related signals for aversive and appetizing objects.
For the orientation correction model, the experiment consisted of receiving a series of target orientations, for which the model had to perform a reorientation task.
For the path integration system, the model was run several times to obtain the distance, angle and position errors with respect to the starting position when performing homing after a random exploration phase.
Finally, the CPG model was tested by plotting the generated trajectories and the post-processed trajectories, and then verifying in Gazebo and ROS that the hexapod moved as desired.

3.2. Locomotion comparison


In this work, we compared the performance of the Clearpath Jackal and our hexapod in a Gazebo environment. The Jackal is an unmanned ground vehicle widely used in robotics research; a model is available in Gazebo and it can be controlled by ROS. It has a high-torque 4 × 4 drivetrain for rugged all-terrain operation, a weight of 17 kg and a maximum speed of 2.0 m s−1.
A series of step-shaped obstacles of increasing height was placed in a straight line (see table 1). To compare the stability of each robot on uneven terrain we set the robots at the same distance from the first step and then let them move in a straight line (figure 2). We executed and recorded three sets of five trials; in each set the speed of the Jackal was varied (0.5, 1 and 1.5 m s−1). The time each robot required to clear each obstacle was recorded.


Figure 3. Proposed circuit for action selection. Dashed lines represent the inputs to the model given by S1 and S2. Excitatory connections are represented as lines ending in an arrow and inhibitory connections as lines ending in a circle.

3.3. Full system simulation


We tested the entire system in a Gazebo + ROS simulation in which the environment was a blank space where the hexapod began to perform a random walk. A blue or red object was randomly placed in the path of the robot. If a blue object was placed, the robot was required to approach it; when the placed object was red, the robot was supposed to return to the starting point. Three simulations were run and recorded for each color, and we report the approximate final position of the robot.

4. Results

4.1. Modelling
4.1.1. Action selection model
The proposed action selection model is based on the social behavior control model presented by Hurtado et al [20], which corresponds to the results presented by Lee et al [33]. The structure proposed by Hurtado et al [20] is the yellow part of the neural network shown in figure 3. Our model controls the transition among actions, namely random search, approaching a target, and homing, based on the perceived distance to two types of objects: the target (blue cylinder) and the dangerous object (red cylinder). In contrast to Hurtado et al [20], we included an auxiliary network that allows the network to output three different signals with similar activation ranges. This addition is shown in figure 3, enclosed in the blue zone.
Distance is obtained by applying the similarity-of-triangles criterion. The introduced hostile and appetizing objects are modeled as red and blue cylinders with a 0.25 m radius, so distance can be inferred by calculating the perceived width of the cylinders in the camera image. The model inputs correspond to a mapping from the perceived distance from the robot to an object onto the operating range of the model. The mapping function is slightly different for S1 and S2 due to the asymmetry that exists in the synaptic weights between units X1 and X2. The function that maps the distance to a hostile object (dr) to S1 is given by equation (3). Similarly, equation (4) represents the mapping of the distance to an appetizing object (db) to S2.
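For illustration, the similarity-of-triangles relation can be written as below. This is a hypothetical sketch; the focal length value is a placeholder that would come from camera calibration and is not stated in the paper.

# Estimating distance to a cylinder of known width from its apparent width in
# the image, using the similarity-of-triangles (pinhole camera) relation.
CYLINDER_WIDTH_M = 0.5        # the red/blue cylinders have a 0.25 m radius
FOCAL_LENGTH_PX = 530.0       # assumed value, obtained from camera calibration

def distance_from_width(perceived_width_px):
    """d = f * W_real / W_image, from similar triangles."""
    return FOCAL_LENGTH_PX * CYLINDER_WIDTH_M / perceived_width_px

print(distance_from_width(66.0))  # about 4 m when the cylinder spans 66 px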
The parameters of the mapping functions were set in an initial stage of the model such that the distance-related signals were transformed to fit the operating range of the bio-inspired units. The asymmetry between the mapping functions corresponds to differences in the perceived priority of each stimulus: for example, if the robot were presented with a situation in which both an aversive and an appetizing object are placed at the same distance, it should always prioritize the action corresponding to the aversive object.


Table 2. Parameters for the action selection model.

Parameter Description Value

N f1−7 slope parameter 2


τ1 − τ4 Time constant for f1−4 0.10
τ5 Time constant for f5 0.25
τ6 − τ7 Time constant for f6−7 0.20
md Maximum activation for f1 and f5−7 5
me Maximum activation for f2−4 10
σg Semi-saturation constant for f1 1.30
σh Semi-saturation constant for f2−4 3.60
σi Semi-saturation constant for f5 0.004
σj Semi-saturation constant for f6 3.00
σk Semi-saturation constant for f7 40
w11 Synaptic weight of connection from X1 to X1 0.1
w21 Synaptic weight of connection from X1 to X2 1.0
w12 Synaptic weight of connection from X2 to X1 0.6
w22 Synaptic weight of connection from X2 to X2 0.1
w31 Synaptic weight of connection from X1 to X3 3.0
w32 Synaptic weight of connection from X2 to X3 3.0
w41 Synaptic weight of connection from X1 to X4 3.0
w42 Synaptic weight of connection from X2 to X4 3.0
w53 Synaptic weight of connection from X3 to X5 2.0
w54 Synaptic weight of connection from X4 to X5 2.0
w55 Synaptic weight of connection from X5 to X5 0.1
w56 Synaptic weight of connection from X6 to X5 15.0
w57 Synaptic weight of connection from X7 to X5 15.0
w61 Synaptic weight of connection from X1 to X6 10.0
w66 Synaptic weight of connection from X6 to X6 2.0
w72 Synaptic weight of connection from X2 to X7 10.0
w77 Synaptic weight of connection from X7 to X7 2.0


S_1 = \begin{cases} -0.375\, d_r + 3.5 & \text{if } d_r \in [0, 4.2) \\ -0.01749\, d_r + 0.27 & \text{if } d_r \in [4.2, 8) \\ 0.11 & \text{if } d_r \in [8, +\infty) \end{cases} \qquad (3)

S_2 = \begin{cases} -0.25\, d_b + 2.25 & \text{if } d_b \in [0, 4.2) \\ -0.01749\, d_b + 0.25 & \text{if } d_b \in [4.2, 8) \\ 0.11 & \text{if } d_b \in [8, +\infty) \end{cases} \qquad (4)
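For reference, the following Python snippet is a direct transcription of equations (3) and (4); the printed example simply illustrates the intended asymmetry at equal distances.

def map_hostile(d_r):
    """Equation (3): map the distance to a hostile object onto input S1."""
    if d_r < 4.2:
        return -0.375 * d_r + 3.5
    elif d_r < 8.0:
        return -0.01749 * d_r + 0.27
    return 0.11

def map_appetizing(d_b):
    """Equation (4): map the distance to an appetizing object onto input S2."""
    if d_b < 4.2:
        return -0.25 * d_b + 2.25
    elif d_b < 8.0:
        return -0.01749 * d_b + 0.25
    return 0.11

# At equal distance the hostile signal dominates, as intended.
print(map_hostile(4.0), map_appetizing(4.0))   # 2.0 > 1.25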

The mutual inhibition between units X1 and X2 allows three configurations: a high activation level at X1 and zero at X2, a high activation level at X2 and zero at X1, or a low activation level at both units. These configurations must lead to three outputs (return, explore, approach) with similar activation levels, which requires the proposed structure formed by units X3, X4 and X5. Units X3 and X4 evaluate whether there is a difference in the activation levels of units X1 and X2, and unit X5 serves as the output. Since the network might express the existence of a difference, strong inhibitory connections from units X6 and X7 onto unit X5 are required. Thus, if a purely high activation level occurs in unit X1 or X2, only unit X6 or X7 will be activated and not unit X5. On the other hand, if the activation levels in X1 and X2 are low, the difference is still detected, but in that case units X6 and X7 exert a minor inhibition that is unable to cancel the activation of X5. To maintain the execution of critical behaviors even without the presence of a stimulus, a self-exciting connection was introduced in units X6 and X7.
The parameters shown in table 2 were found by trial and error, aiming to obtain the desired behavior of the model. Units X1 and X2 form a winner-take-all circuit and converge to a decision depending on the inputs S1 and S2. Units X3 and X4 then calculate the absolute value of the difference between the activity levels of X1 and X2. If the activity level in one of the input units is very high, it will cause the activation of unit X6 or X7 and the suppression of X5. If the activity level of the input units is low and a difference exists in their activation levels, X5 is highly stimulated.
Equations (5)–(11) represent the model shown in figure 3. Xi represents the mean firing rate of unit i, τi refers
to the time constant of unit i. The activation function of unit i is designated as fi and wij indicates the synaptic
weight of the connection originating in unit j and ending in unit i. The Naka-Rushton function was used in this
model. The values of the constants and synaptic weights are specified in table 2.


Table 3. Parameters for Path Integration model.

Parameter Description Value

a Synaptic weight of connection from near neighbors to unit RΩ 0.0050


b Synaptic weight of self inhibitory connection in unit RΩ 0.0068
c Synaptic weight of connection from far neighbors to unit RΩ 3.0
n Number of units encoding orientation 12
τ Time constant 10

\tau_1 \frac{dX_1}{dt} = -X_1 + f_1(S_1 - w_{12} X_2 + w_{11} X_1) \qquad (5)

\tau_2 \frac{dX_2}{dt} = -X_2 + f_2(S_2 - w_{21} X_1 + w_{22} X_2) \qquad (6)

\tau_3 \frac{dX_3}{dt} = -X_3 + f_3(w_{31} X_1 - w_{32} X_2) \qquad (7)

\tau_4 \frac{dX_4}{dt} = -X_4 + f_4(w_{42} X_2 - w_{41} X_1) \qquad (8)

\tau_5 \frac{dX_5}{dt} = -X_5 + f_5(w_{53} X_3 + w_{54} X_4 - w_{56} X_6 - w_{57} X_7 + w_{55} X_5) \qquad (9)

\tau_6 \frac{dX_6}{dt} = -X_6 + f_6(w_{61} X_1 + w_{66} X_6) \qquad (10)

\tau_7 \frac{dX_7}{dt} = -X_7 + f_7(w_{72} X_2 + w_{77} X_7) \qquad (11)
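A minimal sketch of how equations (5)–(11) could be integrated with a forward Euler scheme is shown below. It uses the weights and constants of table 2, but the time step, stimulus values and simulation length are illustrative assumptions rather than the authors' settings.

import numpy as np

def nr(u, m, sigma, N=2):
    """Naka-Rushton activation, equation (2)."""
    u = max(u, 0.0)
    return m * u**N / (sigma**N + u**N)

def action_selection_step(X, S1, S2, dt=0.001):
    """One forward Euler step of equations (5)-(11); constants from table 2."""
    X1, X2, X3, X4, X5, X6, X7 = X
    dX = np.zeros(7)
    dX[0] = (-X1 + nr(S1 - 0.6 * X2 + 0.1 * X1, 5, 1.30)) / 0.10
    dX[1] = (-X2 + nr(S2 - 1.0 * X1 + 0.1 * X2, 10, 3.60)) / 0.10
    dX[2] = (-X3 + nr(3.0 * X1 - 3.0 * X2, 10, 3.60)) / 0.10
    dX[3] = (-X4 + nr(3.0 * X2 - 3.0 * X1, 10, 3.60)) / 0.10
    dX[4] = (-X5 + nr(2.0 * X3 + 2.0 * X4 - 15.0 * X6 - 15.0 * X7
                      + 0.1 * X5, 5, 0.004)) / 0.25
    dX[5] = (-X6 + nr(10.0 * X1 + 2.0 * X6, 5, 3.00)) / 0.20
    dX[6] = (-X7 + nr(10.0 * X2 + 2.0 * X7, 5, 40.0)) / 0.20
    return X + dt * dX

# Drive the network with constant, distance-derived inputs (placeholder
# values) and inspect the settled unit activities.
X = np.zeros(7)
for _ in range(20000):
    X = action_selection_step(X, S1=0.11, S2=0.11)
print(X.round(3))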

4.1.2. Path integration


This system relies on a path integration module to perform the homing operation when it encounters a dangerous object. The proposed path integration model is based on the vector summation model explained in [31]. That model requires computing the vector summation after completing the walk, while our model is capable of integrating the path at every stop. Our model integrates segments described by a distance and an angle, so that an absolute position of the hexapod can be inferred. The model is composed of a ring of twelve units, each of which responds strongly to a specific angle. Absolute distance is taken as the maximum spiking rate of a unit in this model, and absolute direction is decoded from the activity levels of the three most active units by using equation (13), taken from [31].
Equation (12) represents the path integration model, where RΩ corresponds to the spiking rate of the unit with preferential direction Ω, and Lθ represents the length of a path segment traveled in a direction θ. Near neighbors exert an excitatory influence on the unit with preferential direction Ω with a strength a, and units located far from that unit inhibit it with a synaptic weight c. Each unit also inhibits itself, with weight b, to counterbalance the excitation produced by its near neighbors. Table 3 contains the model parameters, which were established by tuning the model until the desired output was obtained. The values of a and b are kept small because changes in them made the output vary widely, while c needs to be large enough for the model to effectively produce only one bump of activity instead of several.
W 30 
dRW
t = - RW + ⎛⎜Lq cos (W - q ) + a å Ra
dt ⎝ a =W 15 
W+ 45

- bRW - c å RW⎟ (12)
b =W- 45 ⎠+

\theta_{perceived} = \left( \frac{360}{n} \right) \frac{R_{max-1} - R_{max+1}}{2 (R_{max+1} + R_{max-1} - 2 R_{max})} - \Omega_{max} \qquad (13)
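As a point of reference for what the ring encodes, the following sketch computes the ideal vector sum of a sequence of (distance, angle) segments and the resulting homing distance and direction. It is the conventional geometric computation that the neural model approximates, not the neural model itself.

import numpy as np

def integrate_path(segments):
    """segments: iterable of (length_m, heading_deg) pairs, one per stop.
    Returns the final position, the distance home and the homing heading."""
    position = np.zeros(2)
    for length, heading in segments:
        rad = np.deg2rad(heading)
        position += length * np.array([np.cos(rad), np.sin(rad)])
    home_distance = np.linalg.norm(position)
    home_heading = np.degrees(np.arctan2(-position[1], -position[0])) % 360.0
    return position, home_distance, home_heading

pos, dist, heading = integrate_path([(1.0, 0.0), (1.0, 45.0), (1.0, 90.0)])
print(pos, dist, heading)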

4.1.3. Orientation correction model


The objective of this model is to direct the turning of the robot based on information about the desired and the actual orientations. In figure 4, units with a blue background encode the desired orientation, while white background units encode the measured angular position; they will be referred to as the A layer and the M layer, respectively. Each of the units in the A and M layers has a preferential direction to which it responds with a greater firing rate, following the functioning of head direction cells.

Figure 4. Proposed circuit for orientation correction. Red lines refer to the inhibitory connections originating in the reference layer (blue units) and ending in the L layer (magenta units). Blue lines correspond to the inhibitory connections originating in the current orientation layer (white units) and ending in the R layer (yellow units). Black lines correspond to excitatory connections from the layers that encode orientation to the units that compute differences. They are arranged in a somatotopic way.

Units are grouped according to their preferential direction. The encoding process is done by calculating the projection of the vectors corresponding to the reference and measured positions onto the vectors corresponding to the preferential direction of each ensemble. Units in layers R and L make it possible to know whether there is any difference between the desired and measured orientations. Units in layer R receive excitatory activity from A units situated in the same ensemble and receive inhibition from M units located in ensembles to their right. Similarly, units in the L layer receive excitation from M units situated in the same group and receive inhibition from A units in ensembles to their right. Unit TL accumulates activity from units in L, and TR does the same for units in R.
\tau \frac{dA_i}{dt} = -A_i + f_A(k \cos(\Omega_i - \theta_{reference})) \qquad (14)

\tau \frac{dM_i}{dt} = -M_i + f_M(k \cos(\Omega_i - \theta_{measured})) \qquad (15)

\tau \frac{dL_i}{dt} = -L_i + f_L\left( M_i - \sum_{j=1}^{3} A_{i+j} \right) \qquad (16)

\tau \frac{dR_i}{dt} = -R_i + f_R\left( A_i - \sum_{j=1}^{3} M_{i+j} \right) \qquad (17)

\tau_T \frac{dT_L}{dt} = -T_L + f_T\left( \sum_{l=1}^{n} L_l \right) \qquad (18)

\tau_T \frac{dT_R}{dt} = -T_R + f_T\left( \sum_{l=1}^{n} R_l \right) \qquad (19)
Equations (14)–(19) describe the dynamics of the orientation correction model, where Ai and Mi correspond to the activities of units in layers A and M in the ensembles with preferred direction Ωi. In the same way, Li and Ri relate to the activity in layers L and R for the corresponding preferential direction ensembles. TL and TR describe the dynamics of the accumulation units. All the units in this model use the Naka-Rushton activation function, represented by f. Their parameters, shown in table 4, were established by first testing preliminary plausible values and then fine tuning them. The strengths of all connections in this model are equal to one, mainly because the dynamics of this model are given by its architecture. While this model was conceived and designed as a way of controlling angular position, we think it could be useful in other scenarios, such as controlling the walk of a robot toward a desired target position.


Table 4. Parameters for orientation correction model.

Parameter Description Value

N Slope of Naka-Rushton function 2


m Maximum activation level 100
σA Semi-saturation constant of Naka-Rushton of fA 25
σM Semi-saturation constant of Naka-Rushton of fM 25
σL Semi-saturation constant of Naka-Rushton of fL 50
σR Semi-saturation constant of Naka-Rushton of fR 50
σT Semi-saturation constant of Naka-Rushton of fT 50
τ Time constant of Naka-Rushton of fA,R,L,M 1.5
τT Time constant of Naka-Rushton of fT 0.2

The output of this model is given by units TL and TR, whose activity makes the robot turn left or right, respectively. When testing this model, the update of the angular position is given by equation (20), in which ΔM corresponds to the change in the simulated measured orientation. α and β are scalars that modulate the amount of change to be applied; they are also a possible source of asymmetry that permits turning when the reference and measured orientations are opposite.
\Delta M = \alpha T_L - \beta T_R \qquad (20)
However, in the context of the ROS and Gazebo simulation, the output is given as the difference between the activity values of TL and TR evaluated over three ranges, as shown in equations (21)–(22). The output (conveniently chosen) is then published on a topic that relates to the direction of the robot.
\Delta T = T_L - T_R \qquad (21)

\mathrm{output} = \begin{cases} 1 & \text{if } \Delta T \in (0.15, \infty) \\ -1 & \text{if } \Delta T \in [-0.15, 0.15] \\ 0 & \text{if } \Delta T \in (-\infty, -0.15) \end{cases} \qquad (22)
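A literal transcription of the thresholding in equations (21) and (22) is given below; the numeric output codes mirror the equation as printed, and their mapping onto robot commands is defined by the ROS topic convention described above.

def turn_command(TL, TR, band=0.15):
    """Discretize the turning signal per equations (21)-(22)."""
    dT = TL - TR                    # equation (21)
    if dT > band:
        return 1                    # deltaT in (0.15, +inf)
    if dT >= -band:
        return -1                   # deltaT in [-0.15, 0.15]
    return 0                        # deltaT in (-inf, -0.15)

print(turn_command(0.6, 0.2), turn_command(0.3, 0.3), turn_command(0.1, 0.9))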

4.1.4. Central pattern generator


A modification of a Central Pattern Generator (CPG) model in [31] is proposed in order to generate the signals to control the robot's joints. Figure 5 shows the model. The added ensemble of units on each side of the output of the original model (blue background in figure 5) allows changing the dynamics of the signals. Equations (23)–(32) describe the dynamics of the model and their parameters are shown in table 5, where the Naka-Rushton parameters were selected heuristically using biologically plausible values. The time constants and synaptic weights were established by trial and error while looking for the desired output; the synaptic weights influence the shape of the signals and the threshold changes the duty cycle.
\tau_E \frac{dE_1}{dt} = -E_1 + f_E(K_1 - \alpha E_2) \qquad (23)

\tau_E \frac{dE_2}{dt} = -E_2 + f_E(K_2 - \alpha E_1) \qquad (24)

\tau_Y \frac{dY_1}{dt} = -Y_1 + \beta E_1 \qquad (25)

\tau_Y \frac{dY_2}{dt} = -Y_2 + \beta E_2 \qquad (26)

\tau_P \frac{dP_1}{dt} = -P_1 + f_P(\gamma E_1 - \delta Q_1) \qquad (27)

\tau_P \frac{dP_2}{dt} = -P_2 + f_P(\gamma E_2 - \delta Q_2) \qquad (28)

\tau_Q \frac{dQ_1}{dt} = -Q_1 + f_Q(\epsilon P_1 + \varepsilon Q_1) \qquad (29)

\tau_Q \frac{dQ_2}{dt} = -Q_2 + f_Q(\epsilon P_2 + \varepsilon Q_2) \qquad (30)

\tau_Z \frac{dZ_1}{dt} = -Z_1 + f_Z(\zeta P_1 - \theta) \qquad (31)


Figure 5. Proposed network for the Central Pattern Generator. Lines ending in a circle correspond to inhibitory connections. Lines ending in an arrow refer to excitatory connections.

Table 5. Parameters of the CPG model.

Parameter Description Value

N Slope of the Naka-Rushton function in this model 2


m Maximum activity level 100
σE Semi-saturation constant for fE 120 + Yi
σP Semi-saturation constant for fP 2.5*Qj
σQ Semi-saturation constant for fQ 50.0
σZ Semi-saturation constant for fZ 15.0
τE Time constant for fE 0.4
τY Time constant for fY 50.0
τP Time constant for fP 0.05
τQ Time constant for fQ 50.0
τZ Time constant for fZ 1.0
α Synaptic weight of the connections between E1 and E2 1.0
β Synaptic weight of the connection from E to Y 1.5
γ Synaptic weight of the connection from E to P 1.0
δ Synaptic weight of the connection from Q to P 2.5
ò Synaptic weight of the connection from P to Q 1.0
ε Synaptic weight of the connection from Q to Q 0.4
ζ Synaptic weight of the connection from P to Z 1.0

\tau_Z \frac{dZ_2}{dt} = -Z_2 + f_Z(\zeta P_2 - \theta) \qquad (32)
The output of the neural network had to be mapped to the limits of the robot joints. Equations (33) and (34)
represent the mapping process.
U_i(t) = \frac{E_i(t)}{70} - 0.4 \qquad (33)

T_i(t) = \frac{Z_i(t)}{100} \qquad (34)
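The mapping of equations (33) and (34) is straightforward to express in Python. Assuming the joint commands are expressed in radians (the paper does not state the units), the sketch below shows the resulting ranges for the maximum activity level m = 100 of table 5.

def map_first_dof(E_i):
    """Equation (33): scale the oscillator output for the first joint."""
    return E_i / 70.0 - 0.4

def map_second_dof(Z_i):
    """Equation (34): scale the reshaped signal for the second joint."""
    return Z_i / 100.0

# With E_i, Z_i in [0, 100], the commands span roughly [-0.4, 1.03] and [0, 1]
# (units assumed to be radians).
print(map_first_dof(100.0), map_second_dof(100.0))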

4.2. Simulations
4.2.1. Action selection model
In the first simulation, from t = 0 ms to t = 5 ms, dr and db (the distances to a hostile object and an appetizing object, respectively) are equal to 10 m. Consequently, as can be seen in figure 6, during this period the activities in X1 and X2 are close to zero, although the activity in X1 is slightly higher. The unit with predominant activity is X5, which corresponds to the exploratory behavior.


Figure 6. Action selection model simulation, transition from exploration to return.

Figure 7. Action selection model simulation, transition from exploration to approach.

In the lapse from t = 5 ms to t = 15 ms the distance to both agents changes to 4 m. It can be noticed that the asymmetry in the mapping functions results in S1 being greater than S2. Unit X1 presents a predominant level of activity, which leads to X6 being activated and to the expressed behavior. Finally, at t = 15 ms the distance to the hostile and appetizing agents is changed back to 10 m; however, due to the existing self-excitation in X6, the level of activity persists, and so does the return behavior.
In figure 7, between t = 5 ms and t = 15 ms, the distances to the hostile and to the appetizing agents are set to 20 m and 3 m respectively, resulting in the expression of the approach behavior. As in the previous simulation, due to the self-excitation connection, the high activity level is now preserved in X7 even after db has increased.

4.2.2. Orientation correction model


When tested in a standalone way, the update rule for the agent's orientation is given by equation (20). Figure 8 shows a task given to the model in which it had to reach a reference value. Each time a value was presented, the model reached it in approximately 50 ms. Figure 9 shows the activation levels of the units that promote turning in the counterclockwise direction (blue line) and in the clockwise direction (orange line). From t = 0 ms to t = 100 ms the reference value is 90° and the initial orientation is 0°. From a radial point of view, the setpoint is shifted to the left of the agent; in the same lapse of time in figure 9 the unit that promotes turning to the left is more active. This shows that the model is capable of updating the orientation of the agent according to the relative orientation of both the reference value and its current orientation.


Figure 8. Change in the orientation over time. The red line represents the reference value that the agent must reach. The black dashed line represents the orientation of the agent.

Figure 9. Orientation correction model. The blue line corresponds to the spiking rate of the unit that promotes turning to the left, and the red line represents the activation level of the unit that promotes turning to the right.

4.2.3. Path integration model


We ran our model several times in order to test it on random trajectories, saving the final error in position, orientation and distance. The error in position was calculated as the Euclidean distance between the expected position and the integrated one, divided by the trajectory length. The error in orientation was calculated as the difference between orientations over 180°. Finally, the distance error was reported as the relative error between the expected and integrated distances.
Each individual step to be integrated by the model was generated by adding a random angle between −45° and 45° to an initially random orientation. Each step length was drawn at random between 0.99 m and 1.01 m. Table 6 shows the percentage of tests in which the error is less than 10%.
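The following sketch shows how the random trajectories and the reported error metrics could be computed. The wrapping of the angular difference is an assumption, and the functions are illustrative rather than the authors' test code.

import numpy as np

rng = np.random.default_rng()

def random_trajectory(n_steps):
    """Random walk as described: each step turns by a random angle in
    [-45, 45] degrees and has a length drawn from [0.99, 1.01] m."""
    heading = rng.uniform(0.0, 360.0)          # initially random orientation
    steps = []
    for _ in range(n_steps):
        heading += rng.uniform(-45.0, 45.0)
        steps.append((rng.uniform(0.99, 1.01), heading))
    return steps

def position_error(expected_xy, integrated_xy, trajectory_length):
    """Euclidean distance between the expected and integrated final
    positions, divided by the trajectory length."""
    diff = np.asarray(expected_xy) - np.asarray(integrated_xy)
    return np.linalg.norm(diff) / trajectory_length

def angle_error(expected_deg, integrated_deg):
    """Orientation error as the (wrapped) angular difference over 180 degrees."""
    d = abs(expected_deg - integrated_deg) % 360.0
    return min(d, 360.0 - d) / 180.0

steps = random_trajectory(10)
print(len(steps), steps[0])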

4.2.4. Central pattern generator


This work introduces changes in the parameters of the basic oscillator of Wilson [31], which generates the blue and red signals in figure 10, corresponding to the first degree of freedom (DoF) of the hexapod's leg. The black and magenta lines represent the signals for the second DoF of the robot's leg, and were obtained by reshaping the central oscillator waveform by adding the P, Q and Z units and tuning their parameters. These signals were post-processed to adapt their amplitude to the range of each joint, as shown in figure 11.


Figure 10. Signals produced by the central pattern generator model. Red and blue lines correspond to the spiking rates of units E1 and E2 respectively. The dashed black and magenta lines correspond to the activity in units Z1 and Z2.

Figure 11. Post processed signals that represent the angle of a joint in the hexapod.

Table 6. Error measurement.

Iterations with error 10% or less

Number of repetitions Distance Angle Position

10 90.00% 90.00% 90.00%


100 97.00% 98.00% 95.00%
1000 96.20% 98.10% 96.00%
10000 95.75% 97.60% 94.94%

Due to the nature of the connectivity of the basic oscillator, the output signals have a phase difference of almost 180 degrees, which makes the hexapod's gait stable. In addition, the post-processing of the signals avoids collisions between the robot's links.

4.2.5. Locomotion paradigm comparative analysis


For the trials on uneven terrain, as we can see in figure 12, the highest obstacle reached by the Jackal was the 9th one, at a speed of 1.5 m s−1. However, the obstacles reached are very dispersed, because in the fastest-speed tests the robot sometimes overturned when it hit a step.


Figure 12. Histogram of the maximum obstacle cleared by the Jackal.

Figure 13. Histogram of the time required for the Jackal to clear the last obstacle.

At a slower speed, the obstacle reached is almost always the same. In terms of time, as expected, the Jackal took longer to reach the last obstacle at a speed of 0.5 m s−1 than at 1.5 m s−1 (figure 13). As for the hexapod, it always reached the 9th step, and in general it took less than the maximum recorded time to reach this step, as shown in figure 14.

4.2.6. Navigation capabilities comparison


For this test we divided the simulations into two parts. The hexapod first followed a random trajectory, at the end of which the red cylinder was presented. The turtlebot, in contrast, was given the position of the red cylinder from the beginning of the simulation, and we measured the time it took to reach the cylinder and return (homing). The position of the cylinder, the final position of both the hexapod and the turtlebot, and the simulation time are shown in tables 7–9.
In some of the six attempts, as in the first one, the turtlebot showed erratic behavior in each trajectory, having a longer simulation time and ending farther from the starting point than the hexapod, even though it only had to travel directly to the red cylinder and return to the initial point.

4.2.7. Integrated system simulation


All the aforementioned systems were integrated as shown in figure 15. The diagram relates the subsystems of the
hexapod. The green blocks correspond to routines that perform online integration of differential equations that
represent the bio-inspired neural networks. The violet blocks are routines that execute classical computation


Figure 14. Histogram of the time required for the hexapod to clear the last obstacle.

Table 7. Test all points turtlebot.

Attempt    X danger obj    Y danger obj    Final X pos    Final Y pos    Homing time

1 −18.1817 −10.0283 0.0107 −4.4137 14:40


2 −16.0158 −2.9322 −0.7325 0.6727 3:30
3 −16.8657 0.6799 0.4852 0.1282 6:15
4 −15.2084 −7.9892 0.2448 0.1319 5:50
5 −16.5258 −2.1673 −0.2834 0.1289 5:34
6 −17.0782 5.9919 0.2727 2.9476 5:48

Table 8. Test all points hexapod.

Attempt    X danger obj    Y danger obj    Final X pos    Final Y pos    Homing time

1 −18.1817 −10.0283 −0.1893 −0.6378 8:54


2 −16.0158 −2.9322 −4.6312 −0.7173 5:40
3 −16.8657 0.6799 −5.1228 1.1096 5:32
4 −15.2084 −7.9892 −2.1879 −1.7716 7:13
5 −16.5258 −2.1673 −4.0377 −0.6041 5:38
6 −17.0782 5.9919 −1.2610 0.5177 7:14

algorithms. The yellow block is the simulation of the hexapod in Gazebo, and the hexagonal block symbolizes the sensors on it.
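As an illustration of how one of the subsystems can be wrapped as a ROS node, a minimal, hypothetical rospy sketch is shown below. The topic names, message types and update rate are placeholders and do not correspond to the actual interfaces used in this work.

#!/usr/bin/env python
# Hypothetical sketch of wrapping one subsystem (orientation correction) as a
# ROS node; topic names, message types and rate are illustrative placeholders.
import rospy
from std_msgs.msg import Float64, Int8

class OrientationCorrectionNode(object):
    def __init__(self):
        self.reference = 0.0
        self.measured = 0.0
        rospy.Subscriber('/hexapod/reference_heading', Float64, self.on_reference)
        rospy.Subscriber('/hexapod/measured_heading', Float64, self.on_measured)
        self.pub = rospy.Publisher('/hexapod/turn_command', Int8, queue_size=1)

    def on_reference(self, msg):
        self.reference = msg.data

    def on_measured(self, msg):
        self.measured = msg.data

    def spin(self):
        rate = rospy.Rate(50)  # integrate the network at 50 Hz (assumed)
        while not rospy.is_shutdown():
            # ... integrate equations (14)-(19) here to obtain TL and TR ...
            TL, TR = 0.0, 0.0  # placeholder values
            dT = TL - TR
            self.pub.publish(Int8(1 if dT > 0.15 else (-1 if dT >= -0.15 else 0)))
            rate.sleep()

if __name__ == '__main__':
    rospy.init_node('orientation_correction')
    OrientationCorrectionNode().spin()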
For the trials in which a red object was presented, the robot was able to perform homing as expected. However, the robot consistently failed to reach the exact initial (0, 0) coordinates. The average distance from the starting position to the final position of the robot was 3.864 m with a standard deviation of 0.546 m, with an accuracy of up to 90%.
In the trials in which the presented object was blue, the robot was able to approach it. The mean distance
from the final position of the robot to the target object was 0.918 m with a standard deviation of 0.702 m.

5. Discussion

We showed that, when executed separately, the proposed models perform their expected functions. The action selection model is able to produce a signal that triggers behaviors such as exploration, homing and approaching.


Figure 15. Diagram that relates the subsystems of the virtual hexapod.

Table 9. Test only final point turtlebot.

Attempt    X danger obj    Y danger obj    Final X pos    Final Y pos    Homing time

1 −18.1817 −10.0283 −0.8980 −0.6988 10:37


2 −16.0158 −2.9322 −0.0868 −0.8440 8:47
3 −16.8657 0.6799 −0.4605 0.0243 3:36
4 −15.2084 −7.9892 0.2453 0.0974 4:05
5 −16.5258 −2.1673 0.2581 −0.1107 5:49
6 −17.0782 5.9919 −0.2582 0.1661 3:18

The path integration model is able to store a signal of the walked path and direct movement back to the initial point with an accuracy of up to 90%, as mentioned before. The orientation correction model can successfully
redirect a virtual agent when the target head direction is changed. The central pattern generator system can
produce trajectories for the robot extremities.
We compared the uneven-terrain locomotion of our proposed legged robot with that of a standard wheeled robot prepared for rugged terrain (Clearpath Jackal). We showed that, in general, our robot was able to reach higher steps than the wheeled robot. However, our robot is significantly slower, since it is difficult for a legged robot of its size to move faster.
We observed that the turtlebot3 platform guided by the navigation stack package is quite inaccurate on certain occasions. In general the hexapod robot guided by our navigation system is slower when performing homing, but the navigation stack presents more variability: it is very fast in general, but at some moments the robot struggles to find the target location. Our hypothesis is that this is caused by the kind of environment it is in. The experiments are carried out in an environment with no obstacles, which is the natural environment for using the navigation stack, as it is able to guide the robot through obstacles. It also uses a cost map to determine which locations are physically allowed; because there are no obstacles in this environment, the costmap tells the system that any location is allowed. As our system does not include obstacle avoidance capability, it would not be suitable for use in the same kind of environment as the navigation stack.
When integrating all the proposed models into a control system for a hexapod robot, we found it achievable
to perform the intended behavior modulation or action selection in a virtual environment. The robot was able to
approach a blue object (appetizing agent) and to turn away and go back home when a red object (aversive agent)
was found. Also, the robot could perform exploration by means of a random walk. In terms of homing, our


system was not able to completely reach the exact starting position. The path integration model requires each step to be finished before the movement is fully integrated. In the scenario where the stimuli are presented randomly, if the robot is in the middle of a step, that step is lost. The system is more sensitive to this when performing homing, because it requires computing the distance from the starting point to the point where it encounters the aversive object. The approaching behavior, however, relies only on the computed orientation: after reorientation, the robot is guided by the vision system. In comparison with other path integration models such as [25], ours is not focused on copying the biological structures of the nervous system but on taking neurobiology as inspiration to propose models that imitate biological behaviors. As mentioned before, the model can be extended from controlling angular position to controlling the walk of a robot toward a desired target position.
We illustrated the applications of biologically inspired neural networks in the control of robots, especially mobile robots intended for exploration, navigation or mapping tasks. In future developments, a better implementation of the system could be produced, taking advantage of existing models and updating parts of our implementation to allow multi-robot simulation and interaction. Another interesting approach would be to propose models such as central pattern generators for underactuated and soft robots, where the coupling between the actuator trajectory and the dynamics of the body parts represents a challenge in terms of modelling and control.

Acknowledgments

JPC and JRO proposed the orientation correction network. The authors are grateful to Universidad Autonoma
de Occidente for its support.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

ORCID iDs

Josh Pardo-Cabrera https://orcid.org/0000-0001-5268-1854


Jesús D Rivero-Ortega https://orcid.org/0000-0002-5044-1846
Julián Hurtado-López https://orcid.org/0000-0002-3773-0598
David F Ramírez-Moreno https://orcid.org/0000-0003-2372-3554

References
[1] Patek S and Summers A 2017 Invertebrate biomechanics Current Biology 27 R371–5
[2] Gao Z, Shi Q, Fukuda T, Li C and Huang Q 2019 An overview of biomimetic robots with animal behaviors Neurocomputing 332 339–50
[3] Bagheri Z M, Cazzolato B S, Grainger S, O’Carroll D C and Wiederman S D 2017 An Autonomous Robot Inspired by Insect
Neurophysiology Pursues Moving Features in Natural Environments 14 046030
[4] Floreano D, Ijspeert A J and Schaal S 2014 Robotics and neuroscience Current Biology 24 R910–20
[5] Yang G-Z et al 2018 The grand challenges of Science robotics Science Robotics 3 eaar7650
[6] Selverston A I 2010 Invertebrate central pattern generator circuits Philosophical Transactions of the Royal Society B: Biological Sciences
365 2329–45
[7] Mombaur K et al 2017 Chapter 4—control of motion and compliance Bioinspired Legged Locomotion ed M A Sharbafi and A Seyfarth
(United Kingdom: Butterworth-Heinemann) pp 135–346
[8] Ijspeert A J 2008 Central pattern generators for locomotion control in animals and robots: a review Neural Netw. 21 642–53
[9] Popovic M B, Lamkin-Kennard K A and Bowers M P 2019 5-control and physical intelligence Biomechatronics ed M B Popovic (New
York, NY: Academic) pp 109–38
[10] Zhou S, Guo Z, Wong K, Zhu H, Huang Y, Hu X and Zheng Y-P 2021 Pathway-Specific Cortico-Muscular Coherence in Proximal-to-
Distal Compensation During Fine Motor Control of Finger Extension After Stroke 18 056034
[11] Tedeschi F and Carbone G 2014 Design issues for hexapod walking robots Robotics 3 181–206
[12] Raibert M H 1986 Legged Robots That Balance (Massachusetts, MA: Massachusetts Institute of Technology)
[13] Yongtian H, Eguren D, Azorín J M, Grossman R G, Luu T P and Contreras-Vidal J L 2018 Brain-Machine Interfaces for Controlling
Lower-Limb Powered Robotic Systems 15 021004
[14] Böttcher S 2006 Principles of Robot Locomotion (https://www2.cs.siu.edu/~hexmoor/classes/CS404-S09/RobotLocomotion.pdf)
[15] Spröwitz A, Ajallooeian M, Tuleu A and Ijspeert A 2014 Kinematic primitives for walking and trotting gaits of a quadruped robot with
compliant legs Frontiers in Computational Neuroscience 8 27
[16] Moro F L, Spröwitz A, Tuleu A, Vespignani M, Tsagarakis N G, Ijspeert A J and Caldwell D G 2013 Horse-like walking, trotting, and
galloping derived from kinematic motion primitives (kMPs) and their application to walk/trot transitions in a compliant quadruped
robot Biol. Cybern. 107 309–20
[17] Mysore S P and Kothari N B 2020 Mechanisms of competitive selection: a canonical neural circuit framework eLife 9 e51473
[18] Hoke K L, Hebets E A and Shizuka D 2017 Neural circuitry for target selection and action selection in animal behavior Integr. Comp.
Biol. 57 808–19


[19] Héricé C, Khalil R, Moftah M, Boraud T, Guthrie M and Garenne A 2016 Decision making under uncertainty in a spiking neural
network model of the basal ganglia Journal of Integrative Neuroscience 15 515–38
[20] Hurtado-López J, Ramirez-Moreno D F and Sejnowski T J 2017 Decision-making neural circuits mediating social behaviors J. Comput.
Neurosci. 43 127–42
[21] Barron-Zambrano J H, Torres-Huitzil C and Girau B 2015 Perception-driven adaptive CPG-based locomotion for hexapod robots Neurocomputing 170 63–78
[22] Kim J, Sharma G and Iyengar S S 2010 Famper: a fully autonomous mobile robot for pipeline exploration 2010 IEEE International
Conference on Industrial Technology pp 517–23
[23] Murphy R R 2004 Human-robot interaction in rescue robotics IEEE Transactions on Systems, Man, and Cybernetics, Part C
(Applications and Reviews) 34 138–53
[24] Heinze S, Narendra A and Cheung A 2018 Principles of insect path integration Current Biology : CB 28 R1043–58
[25] Stone T, Webb B, Adden A, Weddig N B, Honkanen A, Templin R, Wcislo W, Scimeca L, Warrant E and Heinze S 2017 An
anatomically constrained model for path integration in the bee brain Current Biology (https://doi.org/10.1016/j.cub.2017.08.052)
[26] Issa J B and Zhang K 2012 Universal conditions for exact path integration in neural systems Proc. Natl Acad. Sci. 109 6716–20
[27] Page H J I and Jeffery K J 2018 Landmark-based updating of the head direction system by retrosplenial cortex: a computational model
Frontiers in Cellular Neuroscience 12 191
[28] Savelli F and Knierim J J 2019 Origin and role of path integration in the cognitive representations of the hippocampus: computational
insights into open questions J. Exp. Biol. 222
[29] Zeng T, Li X and Si B 2020 Stereoneurobayesslam: a neurobiologically inspired stereo visual slam system based on direct sparse method
arXiv:2003.03091
[30] Saranli U, Buehler M and Koditschek D E 2001 Rhex: a simple and highly mobile hexapod robot The International Journal of Robotics
Research 20 616–31
[31] Wilson H R 1999 Spikes, Decisions and Actions (Oxford: Oxford University Press)
[32] Trappenberg T 2010 Fundamentals of Computational Neuroscience (New York, NY: Oxford University Press)
[33] Lee H, Kim D-W, Remedios R, Anthony T, Chang A, Madisen L, Zeng H and Anderson D 2014 Scalable control of mounting and attack
by esr1 neurons in the ventromedial hypothalamus Nature 509
