Update Monte Carlo Tree Search (UMCTS) Algorithm For Heuristic Global Search of Sizing Optimization Problems For Truss Structures

Fu-Yao Ko1・Katsuyuki Suzuki1・Kazuo Yonekura1
Abstract
Sizing optimization of truss structures is a complex computational problem, and
reinforcement learning (RL) is suitable for dealing with multimodal problems without
gradient computations. In this paper, a new efficient optimization algorithm called
update Monte Carlo tree search (UMCTS) is developed to obtain the appropriate design
for truss structures. UMCTS is an RL-based method that combines a novel update
process and Monte Carlo tree search (MCTS) with the upper confidence bound (UCB).
In the update process, the optimal cross-sectional area of each member is determined
by a search tree in each round, and the initial state of a round is the final state of the
previous round. In the UMCTS algorithm, an accelerator for the number of selections for
member area and the iteration number is introduced to reduce the computation time.
Moreover, for each state, the average reward is replaced by the best reward collected
during the simulation process to determine the optimal solution. The proposed optimization
method is examined on benchmark problems of planar and spatial trusses with
discrete sizing variables to demonstrate its efficiency and validity. It is shown that the
proposed approach is at least ten times faster than the branch
and bound (BB) method. The numerical results indicate that the proposed method stably
achieves better solutions than other conventional methods.
1. Introduction
A truss is a two- or three-dimensional structure composed of linear members joined
together at their end points to sustain loads, and the force acting at each end is directed
along the axis of the member (Haftka and Gürdal 1992; Christensen and Klarbring
2009). The primary goal of truss optimization is to minimize the weight of structures
with no violation of constraints, such as stress, displacement, and buckling limits. Most
algorithms assume that the design variables are continuous (Groenwold and Stander
1997; Schutte and Groenwold 2003; Kaveh and Bakhshpoori 2016). However, due to
the construction and manufacturing practices, the design variables must be chosen from
a list of discrete values (Stolpe 2016).
Originally, many non-heuristic methods based on linear programming (Templeman
and Yates 1983; Duan 1986; John et al. 1987) and mixed integer programming (Kanno
and Guo 2010; Yonekura and Kanno 2010; Kanno and Yamada 2017; Shahabsafa et al.
2018) were developed and frequently used to solve truss optimization problems.
These classes of methods are gradient-based, which means that a relationship between
the design variables and the objective function is required to determine the path towards
an optimal solution. Gradient-based optimization tools are only guaranteed to converge to a
local optimum when the optimization problem is non-convex. Furthermore, these
approaches are not applicable when the design variables are discrete because the
gradients may become singular across the boundary of discontinuity.
Fu-Yao Ko (kofuyao@g.ecc.u-tokyo.ac.jp)
1 The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
In the past three decades, many studies have been reported for discrete truss
optimization based on metaheuristic algorithms, such as genetic algorithm (GA)
(Rajeev and Krishnamoorthy 1992), ant colony optimization (ACO) (Camp and Bichon
2004), harmony search (HS) (Lee et al. 2005), particle swarm optimization (PSO) (Li
et al. 2009), artificial bee colony (ABC) (Sonmez 2011), mine-blast algorithm (MBA)
(Sadollah et al. 2012), subset simulation algorithm (SSA) (Li and Ma 2015), artificial
coronary circulation system (ACCS) (Kooshkbaghi et al. 2020), and dragonfly
algorithm (DA) (Jawad et al. 2021). A metaheuristic algorithm is a stochastic
optimization method that uses randomization and local search, and it is typically
developed by observing random phenomena in nature. These methods
perform poorly when the number of design variables is high. Moreover, the quality of
the solution strongly depends on the initial solution and the parameters of the algorithm.
Therefore, the validity of the initial solution and parameters can be confirmed only by
trial and error, which requires substantial computation time.
Reinforcement learning (RL) is a subclass of machine learning (ML) that
focuses on the behavioral learning of an agent that interacts with a dynamic
environment. The purpose of RL is to maximize the cumulative numerical reward by
training an action taker, called an agent, to take actions. The training of an agent can be
regarded as a trial-and-error process, and the agent gradually learns how to map
different states in a sequential decision process to optimal actions (Sutton and Barto
2018; Chhabra et al. 2019; Ramu et al. 2022). Monte Carlo tree search (MCTS) is a
well-known method for finding optimal decisions in a given domain by taking random
samples in the decision space and building a search tree according to the results
(Browne et al. 2012). MCTS has been highly successful in board games (Silver et al. 2016;
Fu 2019) and in other applications, such as planning, security, chemical synthesis,
scheduling, and vehicle routing (Świechowski et al. 2023).
Recently, Luo et al. (2022a) formulated the truss layout design problem as a
sequential decision process and introduced a two-stage MCTS-based RL method called
AlphaTruss to generate the optimal truss layout considering continuous member size,
shape, and topology. The performance of AlphaTruss was compared with that of GA.
The authors reported that AlphaTruss outperformed GA in finding the truss layout with the
minimum weight under stress, displacement, and buckling constraints in 2D benchmark
problems of truss structures. Moreover, Luo et al. (2022b) developed an MCTS-based RL
algorithm called KR-UCT to deal with continuous action spaces for truss layout design
problems by using kernel regression. The algorithm was tested on various
layout design problems of planar and spatial trusses, which demonstrated its
effectiveness.
To the best of the authors' knowledge, the MCTS-based RL method has not been
applied to sizing optimization of truss structures with discrete variables. Therefore, the
aim of this study is to present a new optimization method, called update Monte
Carlo tree search (UMCTS), to find the appropriate cross-sectional areas of truss
members from a permissible list of standard sections. This algorithm is an RL-based
method that combines a novel update process and MCTS with the upper confidence
bound (UCB). In the update process, the optimal selection for member area of the
truss in each round is found by MCTS, and the initial state of a round is the final state of the
previous round. In this algorithm, an accelerator for the number of selections for member
area and the iteration number is proposed to improve the time efficiency. Furthermore, for
a given state, the best reward collected during the simulation process is used instead of the
average reward to find optimal actions. The details are presented in Sections 3.1 and 3.2.
The remainder of this paper is organized as follows. Section 2 briefly introduces the
statement of truss sizing optimization problems with discrete design variables and the
basic concepts of RL and MCTS. Section 3 provides a theoretical background for the
methodology on how the UMCTS algorithm solves discrete-valued truss optimization
problems. In Section 4, the time efficiency, convergence history, accuracy, and stability of
the proposed method in finding optimal truss designs are demonstrated
on several well-studied truss structures from the literature,
and the results are compared with those of several existing optimization algorithms.
Finally, the conclusions of this study are discussed in Section 5.
2. RL task for discrete structural optimization problems
2.1. Problem formulation
Many problems in engineering have multiple solutions, and selecting the appropriate
one is a challenging task. A truss optimization design with discrete variables can be
formulated as a nonlinear programming problem with multiple nonlinear constraints
concerning the structural behaviors. This problem is NP-hard (non-deterministic
polynomial-time hard). In discrete sizing optimization problems, the cross-sectional areas of the truss
members are chosen as the design variables, which are selected from a sorted set of
standard cross-sectional areas. The problem is constructed to find a design that
minimizes the weight of the structures, while stress and displacement constraints are
satisfied. Thus, the optimal truss structural problem with discrete design variables can
be mathematically formulated as
$$
\text{Minimize}\quad W(\mathbf{A}) = \rho \sum_{i=1}^{n_g} \left( a_i \sum_{j=1}^{\psi_i} l_{ij} \right). \tag{1}
$$
In Eq. 1, 𝑊(𝐀) is the weight of the truss which is minimized. 𝐀 =
(𝑎1 , 𝑎2 , … , 𝑎𝑖 , … , 𝑎𝑛𝑔 ) is the sizing variable vector including member area 𝑎𝑖 selected
from a sorted set 𝐃 of available discrete values in section type 𝑖. Members with section
type 𝑖 have the same cross-sectional area 𝑎𝑖 . For convenience, the term group is used to
denote a section type in this article. 𝐃 is the sorted set including all available discrete
values arranged in ascending order and can be given as
𝐃 = {𝑑1 , 𝑑2 , … , 𝑑ℎ , … , 𝑑𝑛𝑏 }, 1 ≤ ℎ ≤ 𝑛𝑏 , (2)
where ℎ is the index of the available discrete values; 𝑛𝑏 is the total number of available
sections. 𝜌 is the material density. 𝑛𝑔 is the total number of groups. 𝚿 =
{𝜓1 , 𝜓2 , … , 𝜓𝑖 , … , 𝜓𝑛𝑔 } is a sorted set containing elements 𝜓𝑖 which indicates the total
number of the members in group 𝑖. 𝑗 is the index of the member in group 𝑖. 𝑙𝑖𝑗 is the
length of the member 𝑗 in group 𝑖. The constraint functions are given as
$$
\begin{aligned}
&\mathbf{K}(\mathbf{A})\mathbf{U} = \mathbf{F},\\
&\varsigma_{ij} = \mathbf{B}_{ij}\mathbf{U}_{ij},\\
&s_{ij} = \frac{E}{l_{ij}}\,\varsigma_{ij},\\
&s_{\mathrm{m}} \le s_{ij} \le s_{\mathrm{M}},\quad i = 1, 2, \dots, n_g,\quad j = 1, 2, \dots, \psi_i,\\
&u_{\mathrm{m}} \le u_k \le u_{\mathrm{M}},\quad k = 1, 2, \dots, n_c,
\end{aligned}
\tag{3}
$$
where $\mathbf{K}$ is the stiffness matrix of the structure; $\mathbf{U} = (u_1, u_2, \dots, u_k, \dots, u_{n_c})^T$ is the
vector of nodal displacements; $\mathbf{F} = (f_1, f_2, \dots, f_k, \dots, f_{n_c})^T$ is the vector of
applied nodal forces; $k$ indicates the index of the structural joint; $n_c$ denotes
the total number of structural joints; $E$ is the modulus of elasticity; $\varsigma_{ij}$ is the elongation
of the member; $\mathbf{B}_{ij}$ is defined as $\mathbf{B}_{ij} = (-\mathbf{e}_{ij}^T \;\; \mathbf{e}_{ij}^T)$; $\mathbf{e}_{ij}$ is a unit vector along the
member that points from local node 1 to local node 2 and is written as
$\mathbf{e}_{ij} = (\cos\theta_{ij} \;\; \sin\theta_{ij})^T$; $\theta_{ij}$ is the angle from the X-axis to $\mathbf{e}_{ij}$ measured anti-clockwise, i.e.
around the Z-axis; $\mathbf{U}_{ij} = (\mathbf{U}_{ij,1} \;\; \mathbf{U}_{ij,2})^T$ is a vector including the displacements of the
end points of the member; $\mathbf{U}_{ij,1}$ and $\mathbf{U}_{ij,2}$ are defined as $\mathbf{U}_{ij,1} = (u_{ij,1X} \;\; u_{ij,1Y})^T$ and
$\mathbf{U}_{ij,2} = (u_{ij,2X} \;\; u_{ij,2Y})^T$. Fig. 1 shows a general member in a planar truss. In Eq. 3, the
normal stresses $s_{ij}$ are compared with the allowable stresses $s_{\mathrm{m}}$ and $s_{\mathrm{M}}$. Moreover, the
nodal displacements $u_k$ are compared with the allowable displacements $u_{\mathrm{m}}$ and $u_{\mathrm{M}}$.
Fig. 1: A general member in the truss (Haftka and Gürdal 1992)
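As a concrete reading of Eqs. 1 and 3, the following Python sketch evaluates the weight of a grouped truss and the normal stress of a single planar member from its end displacements. The function names, the argument layout, and the numerical values in the usage note are illustrative assumptions; the global solve 𝐊(𝐀)𝐔 = 𝐅 is not included.

```python
import math

def truss_weight(rho, areas, lengths_by_group):
    """Eq. 1: W = rho * sum_i (a_i * sum_j l_ij).

    areas[i] is the area of group i; lengths_by_group[i] lists the member
    lengths of that group (hypothetical data layout).
    """
    return rho * sum(a * sum(ls) for a, ls in zip(areas, lengths_by_group))

def member_stress(E, length, theta, u1, u2):
    """Eq. 3 for one planar member.

    B U = (-e^T  e^T)(U1; U2) = e . (U2 - U1) is the elongation, and the
    normal stress is E / l times that elongation.
    """
    ex, ey = math.cos(theta), math.sin(theta)
    elongation = ex * (u2[0] - u1[0]) + ey * (u2[1] - u1[1])
    return E / length * elongation

def is_feasible(stresses, displacements, s_min, s_max, u_min, u_max):
    """Check the stress and displacement bounds of Eq. 3."""
    return (all(s_min <= s <= s_max for s in stresses)
            and all(u_min <= u <= u_max for u in displacements))
```

For instance, a horizontal member of length 100 in with 𝐸 = 10,000 ksi whose second node moves 0.01 in along the member axis has an elongation of 0.01 in and therefore a stress of 1 ksi.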
2.2. Key concepts of RL
RL is an ML technique that enables an agent to learn to complete a particular task
within an interactive environment by taking actions that are expected to maximize the cumulative reward.
RL does not require direct knowledge or a model of the environment. The agent
determines optimal actions in terms of long-term expected rewards. To find the long-
term relationship between states and actions, the agent must find a balance between
exploration (attempting to obtain new knowledge) and exploitation (optimizing
decisions based on existing knowledge).
RL is modeled as a Markov decision process (MDP), in which the next state and expected
reward depend solely on the current state and action. A finite MDP is a discrete-time
stochastic sequential decision process defined by a 4-tuple (𝒮, 𝒜, 𝒫, ℛ), where 𝒮
is a finite set of all possible decision states; 𝒜 is a finite set of all valid actions;
𝒫(𝒮, 𝒜, 𝒮′) is the probabilistic model of the transition; and ℛ(𝒮, 𝒜, 𝒮′) is the reward
function for the terminal state. The framework is shown in Fig. 2.
Fig. 2: Framework of RL (Sutton and Barto 2018)
2.3. A survey of MCTS methodology
MCTS is an iterative, guided, random best-first tree search technique that uses random
sampling to explore the specified domain and make the best decision. MCTS starts
with a search tree containing only one node at the root, which is not changed during
the process. Subsequently, the search tree is built incrementally and is further
expanded by searching the most promising moves until the search time
runs out. MCTS consists of four main steps, outlined in Fig. 3, which are repeated
iteratively until a stop condition is met. The four steps are as follows.
⚫ Selection: Starting from the root node, the tree policy is recursively applied to
descend through the tree that selects a child node. When the leaf node is reached,
it is selected for expansion.
⚫ Expansion: All child nodes are added to the selected leaf node according to the
available actions to expand the search tree.
⚫ Simulation: A simulation is run from the leaf node. The algorithm is performed
according to the default policy to produce a simulation result.
⚫ Backpropagation: The simulation result is backpropagated from the selected leaf
node to the root node. Statistics are updated for the nodes visited during
the selection, and their visit counts are increased.
Fig. 3: Strategic steps of the MCTS algorithm (Browne et al. 2012)
The most common selection strategy for MCTS is the UCB given in Eq. 4 (Kocsis
and Szepesvári 2006). UCB employs uncertainty to estimate the values of actions for
balancing exploitation and exploration. UCB includes two terms, the reward term
$\sum r_v / N_v$ and the uncertainty term $\sqrt{\ln n_\Upsilon / N_v}$, as formulated by
$$
\mathcal{U}_v = \frac{\sum r_v}{N_v} + C\sqrt{\frac{\ln n_\Upsilon}{N_v}}, \tag{4}
$$
where $v$ is the index of the node; $\sum r_v$ is the accumulated merit for node $v$ (i.e., the sum
of the immediate merits of all downstream nodes); $N_v$ is the number of simulations
conducted for node $v$; $n_\Upsilon$ is the total number of simulations executed from the root node
$\Upsilon$; $C$ is a heuristic parameter empirically set to balance exploration and exploitation and
is set to $\sqrt{2}$ in this study. In Eq. 4, the reward term $\sum r_v / N_v$ is utilized to
encourage the exploitation of actions with higher reward, while the uncertainty term
$\sqrt{\ln n_\Upsilon / N_v}$ is used to encourage the exploration of less-visited actions.
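The four steps and the UCB rule of Eq. 4 can be sketched in Python as follows. This is a minimal generic illustration, not the authors' implementation: the `Node` class, the toy reward, treating unvisited nodes as having infinite UCB, and returning the most-visited child at the end are all assumptions made for the example.

```python
import math
import random

class Node:
    """One node of the search tree; `state` is the tuple of actions taken so far."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.visits = 0          # N_v
        self.reward_sum = 0.0    # sum of r_v

def ucb(node, n_root, c=math.sqrt(2)):
    """Eq. 4; unvisited nodes get +inf so that each child is tried once (assumption)."""
    if node.visits == 0:
        return math.inf
    return node.reward_sum / node.visits + c * math.sqrt(math.log(n_root) / node.visits)

def mcts(root, actions, depth, reward_fn, iterations, rng):
    for _ in range(iterations):
        node = root
        # Selection: descend with the tree policy (UCB) until a leaf is reached.
        while node.children:
            node = max(node.children, key=lambda ch: ucb(ch, root.visits))
        # Expansion: add one child per available action to a non-terminal leaf.
        if len(node.state) < depth:
            node.children = [Node(node.state + (a,), node) for a in actions]
        # Simulation: complete the leaf state with random actions (default policy).
        state = node.state
        while len(state) < depth:
            state += (rng.choice(actions),)
        reward = reward_fn(state)
        # Backpropagation: update statistics from the leaf up to the root.
        while node is not None:
            node.visits += 1
            node.reward_sum += reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits)

# Toy usage: pick three digits from {0, 1, 2}; the reward is their normalized sum.
root = Node(())
best = mcts(root, actions=(0, 1, 2), depth=3,
            reward_fn=lambda s: sum(s) / 6.0, iterations=200, rng=random.Random(0))
```

In this sketch the first iteration simulates directly from the root, so the children of the root share 199 of the 200 visits.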
3. UMCTS algorithm for sizing optimization of truss structures
3.1. Update process
The update process of the UMCTS algorithm is shown in Fig. 4. In round $p$, the initial
state is a vector $\mathbf{\Gamma}^p = (\gamma_1^p, \gamma_2^p, \dots, \gamma_i^p, \dots, \gamma_{n_g}^p)$ including member area $\gamma_i^p$, which is
selected from a sorted set $\mathbf{D}$. $\mathbf{D}_1^p, \mathbf{D}_2^p, \dots, \mathbf{D}_i^p, \dots, \mathbf{D}_{n_g}^p$ are sorted sets used to determine
the best solution vector. $\mathbf{D}_i^p = \{(d_i^p)_m, \dots, (d_i^p)_h, \dots, (d_i^p)_\mu, \dots, (d_i^p)_M\}$ contains
elements $(d_i^p)_m$, $(d_i^p)_\mu$, and $(d_i^p)_M$, which are the minimum,
median, and maximum values in $\mathbf{D}_i^p$, and $(d_i^p)_\mu$ is equal to $\gamma_i^p$. The final state
$\bar{\mathbf{\Gamma}}^p = (\bar{\gamma}_1^p, \bar{\gamma}_2^p, \dots, \bar{\gamma}_i^p, \dots, \bar{\gamma}_{n_g}^p)$ is found by the search tree and is used as the initial solution vector
$\mathbf{\Gamma}^{p+1} = (\gamma_1^{p+1}, \gamma_2^{p+1}, \dots, \gamma_i^{p+1}, \dots, \gamma_{n_g}^{p+1})$ in round $(p + 1)$. It is worth noting that in
round 1, the initial state is $\mathbf{\Gamma}^1 = (\gamma_1^1, \gamma_2^1, \dots, \gamma_i^1, \dots, \gamma_{n_g}^1) = (\max(\mathbf{D}), \dots, \max(\mathbf{D}))$, i.e., all
the elements of $\mathbf{\Gamma}^1$ are equal to the maximum value in $\mathbf{D}$. The update process is run for
multiple rounds. The algorithm continues running until the relative error $\eta$ between the
minimum weight $W_m^{1 \sim \varepsilon - 1} = \min\{\bar{W}^1, \bar{W}^2, \dots, \bar{W}^{\varepsilon - 1}\}$ of the final states from round 1 to
$(\varepsilon - 1)$ and the weight $\bar{W}^\varepsilon$ of the final state in round $\varepsilon$ is below 0.01%. Then, the final
state $\bar{\mathbf{\Gamma}}^\varepsilon$ is the optimal solution of the optimization problem in Section 2.1.
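The stopping rule of the update process can be sketched as below. The text does not spell out the formula for the relative error 𝜂, so the sketch assumes 𝜂 = |W̄^ε − W_m| / W_m, where W_m is the minimum final-state weight over rounds 1 to ε − 1; the function name and the example weights are hypothetical.

```python
def has_converged(final_weights, tol=1e-4):
    """Return True when the update process may stop.

    final_weights holds the final-state weight of every completed round, in
    order. tol = 1e-4 corresponds to the 0.01 % threshold. The definition of
    the relative error used here is an assumption.
    """
    if len(final_weights) < 2:
        return False                      # nothing to compare against yet
    best_previous = min(final_weights[:-1])
    eta = abs(final_weights[-1] - best_previous) / best_previous
    return eta < tol
```

With final weights of 5600.0, 5500.0, and 5500.2 lb in three successive rounds, the third round gives 𝜂 ≈ 3.6 × 10⁻⁵ and the loop would stop under this assumed definition.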
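To accelerate the UMCTS algorithm, an accelerator for the number of selections for
member area $\tau$ is proposed. The number of selections for member area $\tau^p$ in round $p$ is
expressed as
$$
\begin{aligned}
\tau^1 &= |\mathbf{D}_1^1| = \dots = |\mathbf{D}_i^1| = \dots = |\mathbf{D}_{n_g}^1| =
\begin{cases}
|\mathbf{D}| & \text{if } |\mathbf{D}| \text{ is an odd number},\\
|\mathbf{D}| + 1 & \text{if } |\mathbf{D}| \text{ is an even number},
\end{cases}\\
\zeta^p &= \tau^1 \times 0.5^{\lceil (p-1)/3 \rceil},\\
\varrho^p &= \lfloor \zeta^p \rfloor,\\
\kappa^p &=
\begin{cases}
\varrho^p & \text{if } \varrho^p \text{ is an odd number},\\
\varrho^p + 1 & \text{if } \varrho^p \text{ is an even number},
\end{cases}\\
\tau^p &= |\mathbf{D}_1^p| = \dots = |\mathbf{D}_i^p| = \dots = |\mathbf{D}_{n_g}^p| = \max(3, \kappa^p) \quad (p > 1),
\end{aligned}
\tag{5}
$$
where $|\mathbf{D}_i^p|$ is the total number of elements in $\mathbf{D}_i^p$; $\lceil (p-1)/3 \rceil$ is the least integer greater than
or equal to $(p-1)/3$; $\lfloor \zeta^p \rfloor$ is the greatest integer less than or equal to $\zeta^p$; and $\max(3, \kappa^p)$ is the
larger of the two values 3 and $\kappa^p$.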
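To make the schedule of Eq. 5 concrete, the sketch below computes 𝜏^p for a given round and size of 𝐃. The function name is an assumption, and integer arithmetic is used in place of the floating-point floor and ceiling, which gives the same result.

```python
def selections_per_group(p, n_D):
    """Eq. 5: number of candidate areas per group in round p, with n_D = |D|."""
    tau_1 = n_D if n_D % 2 == 1 else n_D + 1   # make the first-round count odd
    if p == 1:
        return tau_1
    halvings = -(-(p - 1) // 3)                # ceil((p - 1) / 3)
    rho = tau_1 // 2 ** halvings               # floor(tau_1 * 0.5 ** halvings)
    kappa = rho if rho % 2 == 1 else rho + 1   # round up to the next odd number
    return max(3, kappa)

# Schedule for a list of 42 sections (the size of D in Case 1 of the 10-bar truss).
schedule = [selections_per_group(p, 42) for p in range(1, 15)]
```

For |𝐃| = 42 this gives 43 candidates in round 1, 21 in rounds 2 to 4, 11 in rounds 5 to 7, 5 in rounds 8 to 10, and 3 from round 11 onwards, so the candidate window is halved every three rounds and always keeps an odd size.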

3.2. MCTS methodology for sizing optimization of truss structures
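To apply MCTS to the sizing optimization of truss structures, it is necessary to
formulate the problem as an MDP. A state is denoted by a tuple $(\mathbf{\Phi}, \mathbf{\Gamma})$, which is
expressed as
$$
\begin{aligned}
\text{Minimize}\quad & W(\mathbf{\Gamma}) = \rho \sum_{i=1}^{n_g} \left( \gamma_i \sum_{j=1}^{\psi_i} l_{ij} \right),\\
\text{Subject to}\quad & \mathbf{K}(\mathbf{\Gamma})\mathbf{U} = \mathbf{F},\\
& \varsigma_{ij} = \mathbf{B}_{ij}\mathbf{U}_{ij},\\
& s_{ij} = \frac{E}{l_{ij}}\,\varsigma_{ij},\\
& s_{\mathrm{m}} \le s_{ij} \le s_{\mathrm{M}},\quad i = 1, 2, \dots, n_g,\quad j = 1, 2, \dots, \psi_i,\\
& u_{\mathrm{m}} \le u_k \le u_{\mathrm{M}},\quad k = 1, 2, \dots, n_c,\\
& \mathbf{\Phi} = \{\varphi_1, \varphi_2, \dots, \varphi_i, \dots, \varphi_{n_g}\}.
\end{aligned}
\tag{6}
$$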
$\mathbf{\Phi} = \{\varphi_1, \varphi_2, \dots, \varphi_i, \dots, \varphi_{n_g}\}$ is a sorted set whose elements are either 0 or 1. 0 means that the
cross-sectional area of that group is modified, and 1 indicates that the member area is
unmodified. The only type of action is to select a cross-sectional area in $\mathbf{D}_i^p$. After
taking an action, $\mathbf{\Phi}$ and $\mathbf{\Gamma}$ are changed depending on the consequence of the undertaken
action. Therefore, the state transition is defined as the variation of the tuple $(\mathbf{\Phi}, \mathbf{\Gamma})$. The
reward $r$ is based on the results from the structural simulation. The structure $(\mathbf{\Phi}, \mathbf{\Gamma})$
is checked against the stress and displacement constraints. If any constraint is
violated, $r_T$ is equal to 0. If the structure
$(\mathbf{\Phi}, \mathbf{\Gamma})$ satisfies all the constraints, $r_T$ is $\alpha / (W_T)^2$. Here, $\alpha$ is the minimum weight,
which is defined as the weight of the truss with the cross-sectional area of all members
equal to the minimum value in $\mathbf{D}$, and $W_T$ is the weight of the truss when reaching the
terminal state $T$.
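The terminal reward described above can be written as a short Python function. The function and argument names are assumptions; the feasibility flag stands in for the stress and displacement checks of the structural analysis.

```python
def terminal_reward(weight, alpha, feasible):
    """Reward at the terminal state T: alpha / W_T**2 if all constraints hold, else 0."""
    if not feasible:
        return 0.0
    return alpha / weight ** 2

# Two feasible designs and one infeasible design, with alpha = 100 (hypothetical values).
r_light = terminal_reward(200.0, 100.0, True)
r_heavy = terminal_reward(400.0, 100.0, True)
r_bad = terminal_reward(150.0, 100.0, False)
```

Because the weight enters squared in the denominator, halving the weight of a feasible design quadruples its reward, while any infeasible design receives no reward regardless of its weight.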
In round $p$, a new search tree is created following the procedure in Fig. 5, and the
optimal action for each state is then determined by UCB as shown in Fig. 6. At first, there is only one
root node in the search tree. The state of the root node $\Upsilon$ is $\mathbf{\Phi}_\Upsilon^p = \{1, 1, \dots, 1, \dots, 1\}$ and
$\mathbf{\Gamma}_\Upsilon^p = (\gamma_1^p, \gamma_2^p, \dots, \gamma_i^p, \dots, \gamma_{n_g}^p)$. The search tree is created by the four-step
repetition described in Fig. 3 (Section 2.3). In each iteration, the most promising child
node is selected using the UCB formula (Eq. 4). Subsequently, the algorithm expands the
search tree and performs a simulation. At the end of an iteration, the algorithm uses the
received reward $r$ to update the information from the leaf node to the root node. It is
worth mentioning that Eq. 7 uses the maximum value of the reward $(r_{q+1}^p)_M$ to calculate
the UCB, where $q$ is the layer number. The reason for choosing $(r_{q+1}^p)_M$ is that the goal
of this problem is to find the optimal solution, so the average reward $\sum r_{q+1}^p / N_{q+1}^p$
should not be considered.
$$
\mathcal{U}_{q+1}^p = (r_{q+1}^p)_M + \sqrt{\frac{2 \ln n_\Upsilon^p}{N_{q+1}^p}} \tag{7}
$$
After conducting a certain number of iterations, $(r_{q+1}^p)_M$ is used to determine the
optimal action for the current state $(\mathbf{\Phi}_q^p, \mathbf{\Gamma}_q^p)$. The child node $v$ with
$\max\{(r_{q+1}^p)_1, (r_{q+1}^p)_2, \dots, (r_{q+1}^p)_v, \dots, (r_{q+1}^p)_{\tau^p}\}$ is chosen. When conducting the state
transition, state $(\mathbf{\Phi}_q^p, \mathbf{\Gamma}_q^p)$ is converted into $(\mathbf{\Phi}_{q+1}^p, \mathbf{\Gamma}_{q+1}^p)$. Moreover, the state of the root
node becomes $(\mathbf{\Phi}_{q+1}^p, \mathbf{\Gamma}_{q+1}^p)$. When state $(\mathbf{\Phi}_{n_g}^p, \mathbf{\Gamma}_{n_g}^p)$ is reached, the algorithm completes
a round, and the final state $\bar{\mathbf{\Gamma}}^p$ for round $p$ is determined.
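The difference between the best-reward rule of Eq. 7 and an average-reward rule can be illustrated with a few helper functions. This is a sketch under assumptions: the function names are hypothetical, unvisited children are given an infinite score, and each child is represented only by the list of rewards collected below it.

```python
import math

def ucb_best(best_reward, visits, n_root):
    """Eq. 7: best reward seen below a child plus the exploration term."""
    if visits == 0:
        return math.inf                  # assumption: try unvisited children first
    return best_reward + math.sqrt(2.0 * math.log(n_root) / visits)

def choose_action(rewards_per_child, use_best=True):
    """Index of the child with the largest best (or average) collected reward."""
    def score(rewards):
        return max(rewards) if use_best else sum(rewards) / len(rewards)
    return max(range(len(rewards_per_child)),
               key=lambda v: score(rewards_per_child[v]))

# Child 0 leads to one very good design and two infeasible ones;
# child 1 leads only to mediocre feasible designs (hypothetical rewards).
collected = [[0.0, 0.0, 0.9], [0.4, 0.4, 0.4]]
pick_best = choose_action(collected, use_best=True)
pick_avg = choose_action(collected, use_best=False)
```

In this example the best-reward rule picks child 0, which contains the best design found so far, whereas the average-reward rule picks child 1 because the infeasible simulations pull the mean of child 0 down to 0.3.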
It is known that the accuracy of MCTS improves with more
iterations. However, the efficiency of UMCTS is also an important issue. To
accelerate the UMCTS algorithm, Eq. 8 is proposed to determine the number of
iterations $I_q^p$ for layer $q$ in round $p$, where $\prod$ denotes repeated multiplication.
$$
\begin{aligned}
\text{For root node } \Upsilon:\quad & I_\Upsilon^p = 2|\mathbf{D}| \times \left\lceil \log_{10}\left( \prod_{q=1}^{n_g} |\mathbf{D}_q^p| \right) \right\rceil,\\
\text{For layer } q:\quad & I_q^p = |\mathbf{D}| \times \left\lceil \log_{10}\left( \prod_{\xi=q+1}^{n_g} |\mathbf{D}_\xi^p| \right) \right\rceil.
\end{aligned}
\tag{8}
$$
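A short Python sketch of Eq. 8 shows how the iteration budget shrinks as the tree gets deeper. The function name and the use of `q=None` to denote the root node are assumptions made for this example; `math.prod` requires Python 3.8 or later.

```python
import math

def iterations_for_layer(n_D, set_sizes, q=None):
    """Eq. 8: number of MCTS iterations in one round.

    n_D is |D|; set_sizes[i - 1] is |D_i^p| for group i = 1..n_g.
    q=None denotes the root node, otherwise q is the 1-based layer number.
    """
    if q is None:
        factor, remaining = 2, set_sizes        # product over all groups
    else:
        factor, remaining = 1, set_sizes[q:]    # product over groups q+1..n_g
    # math.prod of an empty list is 1, so the last layer gets log10(1) = 0.
    return factor * n_D * math.ceil(math.log10(math.prod(remaining)))

# Round 1 of a 10-group problem with |D| = 42, so every |D_i^1| = 43 (Eq. 5).
sizes = [43] * 10
budget = [iterations_for_layer(42, sizes)] + \
         [iterations_for_layer(42, sizes, q) for q in range(1, 11)]
```

Under these inputs the root receives 1428 iterations, layer 1 receives 630, layer 9 receives 84, and layer 10 receives none, since no undecided group remains there.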
Fig. 4: Update process in the UMCTS algorithm: (a) flowchart and (b) schematic diagram
Fig. 5: Procedure for search tree creation (selection, expansion, simulation, and backpropagation)
Fig. 6: Flowchart and schematic diagram of optimal action determination
4. Numerical examples
In this section, four truss optimization problems with discrete variables are used to
investigate the effectiveness of the proposed algorithm. The time efficiency is tested
on a 10-bar planar truss and a cantilever planar truss structure. The results are
presented and compared with those of the Basic Open-source Nonlinear Mixed Integer
programming (BONMIN) method, which is a branch and bound (BB) algorithm using
the interior point method and COIN-OR branch and cut (Bonami et al. 2008). Moreover, a
10-bar planar truss, a 25-bar spatial truss, and a 72-bar spatial truss are examined to
investigate the convergence history, solution accuracy, and solution stability of the present
approach. The UMCTS algorithm and the direct stiffness method for the analysis of
truss structures are coded in Python. The computations are
carried out on an Intel Core i7 2.50 GHz processor with 40 GB of memory.
4.1. Problem description
The following four test problems, including a 10-bar planar truss, a cantilever planar
truss structure, a 25-bar spatial truss, and a 72-bar spatial truss are introduced.
Fig. 7 shows the geometry of the 10-bar truss structure. The material density
𝜌 is 0.1 lb/in³. The modulus of elasticity 𝐸 is 10,000 ksi. The truss members are
subjected to stress limitations of ±25 ksi. The displacements of structural joints in both
X and Y directions are limited to ±2 in. The load cases are listed in Table 1. The
10 truss members are categorized into 10 groups, as follows: (1) 𝐴1 , (2) 𝐴2 , (3) 𝐴3 , (4)
𝐴4 , (5) 𝐴5 , (6) 𝐴6 , (7) 𝐴7 , (8) 𝐴8 , (9) 𝐴9 , and (10) 𝐴10 . Three discrete variable cases are
investigated. For Case 1, the discrete variables are selected from the sorted set
𝐃 ={1.62, 1.80, 1.99, 2.13, 2.38, 2.62, 2.63, 2.88, 2.93, 3.09, 3.13, 3.38, 3.47, 3.55,
3.63, 3.84, 3.87, 3.88, 4.18, 4.22, 4.49, 4.59, 4.80, 4.97, 5.12, 5.74, 7.22, 7.97, 11.50,
13.50, 13.90, 14.20, 15.50, 16.00, 16.90, 18.80, 19.90, 22.00, 22.90, 26.50, 30.00, 33.50}
in². For Case 2, the discrete variables are selected from the sorted set 𝐃 ={0.1, 0.5,
1.0,…,30.5, 31.0, 31.5} in². For Case 3, the discrete variables are selected from the
sorted set 𝐃 ={4, 8, 12,…,36, 40, 44} in².
The schematic diagram of the cantilever planar truss structure is shown in Fig. 8. The
total number of truss members $n_L$ considered in this study is 5, 10, 15, 20, 25, 30, or 40. The
material density $\rho$ is 0.1 lb/in³. The modulus of elasticity $E$ is 10,000 ksi. The truss
members are subjected to stress limitations of ±25 ksi. A concentrated load of 50 kips
is applied on the bottom-right corner of the truss in the negative Y-direction. The $n_L$
truss members are categorized into $n_L$ groups, as follows: (1) $A_1$, (2) $A_2$, (3) $A_3$, …, ($n_L$)
$A_{n_L}$. The relationship between the allowable displacements $u_{\mathrm{m}}$, $u_{\mathrm{M}}$ and the total number of
members $n_L$ is given as
$$
u_{\mathrm{M}} =
\begin{cases}
n_L / 5 & n_L = 5, 10, 15, 20, 25,\\
2 n_L / 5 & n_L = 30,\\
4 n_L / 5 & n_L = 40,
\end{cases}
\qquad u_{\mathrm{m}} = -u_{\mathrm{M}}. \tag{9}
$$
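Eq. 9 can be tabulated with a small helper; the function name is an assumption, and the values are returned in the same length unit as the allowable displacements of the source.

```python
def allowable_displacement(n_L):
    """Eq. 9: return (u_m, u_M) for a cantilever truss with n_L members."""
    if n_L in (5, 10, 15, 20, 25):
        u_max = n_L / 5
    elif n_L == 30:
        u_max = 2 * n_L / 5
    elif n_L == 40:
        u_max = 4 * n_L / 5
    else:
        raise ValueError("Eq. 9 is only defined for n_L in 5, 10, 15, 20, 25, 30, 40")
    return -u_max, u_max

limits = {n: allowable_displacement(n) for n in (5, 10, 15, 20, 25, 30, 40)}
```

The limit therefore grows linearly from ±1 to ±5 for 5 to 25 members and is relaxed to ±12 and ±32 for 30 and 40 members.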
Configuration of the 25-bar spatial truss is given in Fig. 9. All members are assumed
to be constructed from a material with a material density 𝜌 of 0.1 lb/in³ and a modulus
of elasticity 𝐸 of 10,000 ksi. All members are constrained to 40 ksi in both tension and
compression. Moreover, maximum displacement limitations of ±0.35 in are imposed at
uppermost nodes in every direction. The 25 truss members are parametrized into eight
groups, as follows: (1) 𝐴1 , (2) 𝐴2 – 𝐴5 , (3) 𝐴6 – 𝐴9 , (4) 𝐴10 – 𝐴11 , (5) 𝐴12 – 𝐴13 , (6)
𝐴14 – 𝐴17 , (7) 𝐴18 – 𝐴21 , and (8) 𝐴22 – 𝐴25 . The discrete variables are selected from the
sorted set 𝐃 ={0.1, 0.2, 0.3,…,2.4, 2.5, 2.6, 2.8, 3.0, 3.2, 3.4} in². The load case applied
to the 25-bar spatial truss is described in Table 2.
The 72-bar spatial truss structure is shown in Fig. 10. The problem properties include
the material density 𝜌 of 0.1 lb/in³ and the modulus of elasticity 𝐸 of 10,000 ksi. The
limitation of the stress for the truss members in tension and compression is 25 ksi. The
allowable displacement for each structural joint is ±0.25 in. The structure is under two
independent loading conditions shown in Table 3. There are 72 truss elements which
are grouped into 16 design variables, as follows: (1) 𝐴1 – 𝐴4 , (2) 𝐴5 – 𝐴12 , (3) 𝐴13 – 𝐴16 ,
(4) 𝐴17 – 𝐴18 , (5) 𝐴19 – 𝐴22 , (6) 𝐴23 – 𝐴30 , (7) 𝐴31 – 𝐴34 , (8) 𝐴35 – 𝐴36 , (9) 𝐴37 – 𝐴40 , (10)
𝐴41 – 𝐴48 , (11) 𝐴49 – 𝐴52 , (12) 𝐴53 – 𝐴54 , (13) 𝐴55 – 𝐴58 , (14) 𝐴59 – 𝐴66 , (15) 𝐴67 – 𝐴70 ,
and (16) 𝐴71 – 𝐴72 . Two structural optimization cases have been studied. In Case 1, the
discrete variables are selected from the sorted set 𝐃 ={0.1, 0.2, 0.3,…,3.0, 3.1, 3.2} in².
In Case 2, the discrete design variables are picked out from Table 4 according to
American Institute of Steel Construction (AISC) norm.
Fig. 7: A 10-bar planar truss structure
Fig. 8: Cantilever planar truss structure
Fig. 9: A 25-bar spatial truss structure

Fig. 10: A 72-bar spatial truss structure
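Table 1: Load cases for the 10-bar planar truss (loads in kips)
| Node | Load case 1 𝐹𝑋 | Load case 1 𝐹𝑌 | Load case 2 𝐹𝑋 | Load case 2 𝐹𝑌 | Load case 3 𝐹𝑋 | Load case 3 𝐹𝑌 |
|---|---|---|---|---|---|---|
| 1 | 0.0 | 0.0 | 0.0 | 50.0 | 50.0 | −50.0 |
| 2 | 0.0 | −100.0 | 0.0 | −150.0 | 50.0 | −50.0 |
| 3 | 0.0 | 0.0 | 0.0 | 50.0 | 50.0 | −50.0 |
| 4 | 0.0 | −100.0 | 0.0 | −150.0 | 50.0 | −50.0 |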
Table 2: Load case for the 25-bar spatial truss (loads in kips)
| Node | 𝐹𝑋 | 𝐹𝑌 | 𝐹𝑍 |
|---|---|---|---|
| 1 | 1.0 | −10.0 | −10.0 |
| 2 | 0.0 | −10.0 | −10.0 |
| 3 | 0.5 | 0.0 | 0.0 |
| 6 | 0.6 | 0.0 | 0.0 |
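Table 3: Load cases for the 72-bar spatial truss (loads in kips)
| Node | Load case 1 𝐹𝑋 | Load case 1 𝐹𝑌 | Load case 1 𝐹𝑍 | Load case 2 𝐹𝑋 | Load case 2 𝐹𝑌 | Load case 2 𝐹𝑍 |
|---|---|---|---|---|---|---|
| 17 | 5.0 | 5.0 | −5.0 | 0.0 | 0.0 | −5.0 |
| 18 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | −5.0 |
| 19 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | −5.0 |
| 20 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | −5.0 |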
Table 4: Discrete values available for cross-sectional areas from AISC norm
Number Areas Number Areas Number Areas Number Areas
1 0.111 17 1.563 33 3.840 49 11.500
2 0.141 18 1.620 34 3.870 50 13.500
3 0.196 19 1.800 35 3.880 51 13.900
4 0.250 20 1.990 36 4.180 52 14.200
5 0.307 21 2.130 37 4.220 53 15.500
6 0.391 22 2.380 38 4.490 54 16.000
7 0.442 23 2.620 39 4.590 55 16.900
8 0.563 24 2.630 40 4.800 56 18.800
9 0.602 25 2.880 41 4.970 57 19.900
10 0.766 26 2.930 42 5.120 58 22.000
11 0.785 27 3.090 43 5.740 59 22.900
12 0.994 28 3.130 44 7.220 60 24.500
13 1.000 29 3.380 45 7.970 61 26.500
14 1.228 30 3.470 46 8.530 62 28.000
15 1.266 31 3.550 47 9.300 63 30.000
16 1.457 32 3.630 48 10.850 64 33.500
4.2. Investigation of convergence history: 10-bar planar truss
The convergence histories for the 10-bar planar truss structure under the
best reward and the average reward are compared in Fig. 11. The maximum number of
rounds is 40. The red and blue dashed lines represent the rate of convergence under the best
reward and the average reward, respectively. Figs. 11a and 11b show that, for Case 1 and Case 2, the
proposed method obtains the best solution at 13 and 15 rounds, respectively, under the best reward.
However, the UMCTS does not find the best solution within 40 rounds under the
average reward.
The convergence histories for the 10-bar truss structure considering
different 𝛼 are compared in Fig. 12. The maximum number of rounds is 40. The red, blue, and
green dashed lines denote the rate of convergence for 𝛼 equal to the minimum weight,
median weight, and maximum weight, i.e., the weight of the truss with all member areas
equal to the minimum, median, and maximum value in 𝐃, respectively. Fig. 12a shows
that, for Case 1, the UMCTS algorithm obtains the best solution at 14, 35, and 38
rounds when 𝛼 is equal to the minimum, median, and maximum weight, respectively. Fig. 12b
shows that, for Case 2, the UMCTS obtains the best solution at 14, 18, and 25
rounds when 𝛼 is equal to the minimum, median, and maximum weight, respectively.
Fig. 11: Comparison of the convergence histories under the best reward and the
average reward for (a) Case 1 and (b) Case 2
Fig. 12: Comparison of the convergence histories for 𝛼 equal to minimum, median,
and maximum weight for (a) Case 1 and (b) Case 2
4.3. CPU time comparisons for UMCTS algorithm and BB method
4.3.1. Perspective 1: different types of constraints and loading conditions
Different types of constraints and loading conditions are considered in this part. The 10-bar
truss structure shown in Fig. 7 is used to compare the computational efficiency
of the UMCTS and the BONMIN. Load cases 1 and 2 are applied on this structure.
The sorted set for discrete variables in Case 3 is used. Table 5 compares
the CPU time of the UMCTS 𝑡𝜐 with that of the BONMIN 𝑡𝛽 . It is evident that the
computational cost for the UMCTS algorithm is much less than that for the BONMIN.
Moreover, the CPU time is nearly constant when considering different types of
constraints and loading conditions.
Table 5: Comparison of CPU time between the UMCTS and the BONMIN
considering different types of constraints and loading conditions
4.3.2. Perspective 2: the number of loadings
The number of loadings is considered in this part. The 10-bar planar truss in Fig. 7 is
used to compare the time efficiency of the UMCTS and the BONMIN. Load case
3 is applied on this structure. The sorted set for discrete variables in Case 3 is employed.
The number of loadings is 1, 2, 4, and 8. Table 6 compares the CPU time
of the UMCTS 𝑡𝜐 with that of the BONMIN 𝑡𝛽 . It is evident that the computational cost
for the UMCTS algorithm is much less than that for the BONMIN. Moreover, the CPU
time is nearly constant when the number of loadings varies.
Table 6: Comparison of CPU time between the UMCTS and the BONMIN
considering the number of loadings
4.3.3. Perspective 3: the number of members
The number of members is considered in this part. The cantilever planar truss structure
illustrated in Fig. 8 is utilized to compare the CPU time of the UMCTS 𝑡𝜐 with that of
the BONMIN 𝑡𝛽 , and the results are presented in Table 7.
An asterisk in Table 7 means that the problem cannot be solved
by the BONMIN algorithm. It is seen that the computational cost for the UMCTS algorithm
is much less than that for the BONMIN. Furthermore, BONMIN can only find the
optimal solution when the number of members is less than or equal to 15.
Table 7: Comparison of CPU time between the UMCTS and the BONMIN
considering the number of members
4.4. Solution accuracy
Three benchmark truss structures with discrete design variables, namely the 10-bar
planar truss, the 25-bar spatial truss, and the 72-bar spatial truss, are used to validate the
solution accuracy by comparing the results with those previously reported by other
researchers.
Tables 8-12 compare the optimal design results. It is shown that
the UMCTS outperforms all the compared metaheuristic algorithms in terms of the lightest weight.
Figs. 13-15 reveal that the optimal solution obtained by the UMCTS violates
neither the normal stress constraints nor the nodal displacement constraints.
Table 8: Optimal design comparison for Case 1 of the 10-bar planar truss (optimal cross-sectional areas in in²)
| No | Design variable | GA | ACO | PSO | ABC | ACCS | This study (UMCTS) |
|---|---|---|---|---|---|---|---|
| 1 | 𝐴1 | 33.50 | 33.50 | 30.00 | 33.50 | 33.50 | 33.50 |
| 2 | 𝐴2 | 1.62 | 1.62 | 1.62 | 1.62 | 1.62 | 1.62 |
| 3 | 𝐴3 | 22.00 | 22.90 | 30.00 | 22.90 | 22.90 | 22.00 |
| 4 | 𝐴4 | 15.50 | 14.20 | 13.50 | 14.20 | 14.20 | 16.00 |
| 5 | 𝐴5 | 1.62 | 1.62 | 1.62 | 1.62 | 1.62 | 1.62 |
| 6 | 𝐴6 | 1.62 | 1.62 | 1.80 | 1.62 | 1.62 | 1.62 |
| 7 | 𝐴7 | 14.20 | 7.97 | 11.50 | 7.97 | 7.97 | 7.97 |
| 8 | 𝐴8 | 19.90 | 22.90 | 18.80 | 22.90 | 22.90 | 22.00 |
| 9 | 𝐴9 | 19.90 | 22.00 | 22.00 | 22.00 | 22.00 | 22.00 |
| 10 | 𝐴10 | 2.62 | 1.62 | 1.80 | 1.62 | 1.62 | 1.62 |
| | Weight (lb) | 5613.84 | 5490.74 | 5581.76 | 5490.74 | 5490.74 | 5477.32 |
| | Constraint violation | None | None | None | None | None | None |
Table 9: Optimal design comparison for Case 2 of the 10-bar planar truss
No.  Design variable  PSO      MBA      SSA      ACCS     DA       This study (UMCTS)
1    𝐴1               24.50    29.50    30.00    30.50    30.00    31.50
2    𝐴2               0.10     0.10     0.10     0.10     0.10     0.10
3    𝐴3               22.50    24.00    23.50    23.00    23.00    23.50
4    𝐴4               15.50    15.00    15.00    15.00    15.50    14.50
5    𝐴5               0.10     0.10     0.10     0.10     0.10     0.10
6    𝐴6               1.50     0.50     0.50     0.50     0.50     0.50
7    𝐴7               8.50     7.50     7.50     7.50     7.50     7.50
8    𝐴8               21.50    21.50    21.50    21.00    21.00    21.00
9    𝐴9               27.50    21.50    21.50    22.00    21.50    21.00
10   𝐴10              0.10     0.10     0.10     0.10     0.10     0.10
Weight (lb)           5243.71  5067.33  5067.33  5067.33  5063.51  5052.42
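The UMCTS weights in the last rows of Tables 8 and 9 can be cross-checked against the
objective function 𝑊(𝐀) of Eq. 1. The sketch below is only an illustration under stated
assumptions: it assumes the usual data of the 10-bar benchmark, namely a material
density of 0.1 lb/in³, a bay length of 360 in, members 1-6 of length 360 in, and members
7-10 as diagonals of length 360√2 in. These values and the names RHO, BAY, and
truss_weight are not taken from this section.

```python
import math

RHO = 0.1    # material density in lb/in^3 (assumed benchmark value)
BAY = 360.0  # bay length in inches (assumed benchmark value)

# Assumed member lengths: members 1-6 span one bay, members 7-10 are diagonals.
lengths = [BAY] * 6 + [BAY * math.sqrt(2.0)] * 4


def truss_weight(areas):
    """W(A) = rho * sum_i a_i * l_i, with one cross-sectional area per member."""
    return RHO * sum(a * l for a, l in zip(areas, lengths))


# UMCTS designs copied from the last columns of Tables 8 and 9 (areas in in^2).
umcts_case1 = [33.50, 1.62, 22.00, 16.00, 1.62, 1.62, 7.97, 22.00, 22.00, 1.62]
umcts_case2 = [31.50, 0.10, 23.50, 14.50, 0.10, 0.50, 7.50, 21.00, 21.00, 0.10]

w1 = truss_weight(umcts_case1)
w2 = truss_weight(umcts_case2)
print(f"Case 1: {w1:.2f} lb, Case 2: {w2:.2f} lb")
```

Under these assumptions, the computed weights agree with the reported 5477.32 lb and
5052.42 lb to within rounding.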
Table 10: Optimal design comparison of the 25-bar spatial truss
No.  Design variable  HS       PSO      MBA      SSA      ACCS     This study (UMCTS)
1    𝐴1               0.1      0.4      0.1      0.1      0.1      0.1
2    𝐴2 – 𝐴5          0.3      0.6      0.3      0.3      0.3      0.1
3    𝐴6 – 𝐴9          3.4      3.5      3.4      3.4      3.4      3.4
4    𝐴10 – 𝐴11        0.1      0.1      0.1      0.1      0.1      0.1
5    𝐴12 – 𝐴13        2.1      1.7      2.1      2.1      2.1      1.5
6    𝐴14 – 𝐴17        1.0      1.0      1.0      1.0      1.0      0.9
7    𝐴18 – 𝐴21        0.5      0.3      0.5      0.5      0.5      0.8
8    𝐴22 – 𝐴25        3.4      3.4      3.4      3.4      3.4      3.4
Weight (lb)           484.85   486.54   484.85   484.85   484.85   479.91
Table 11: Optimal design comparison for Case 1 of the 72-bar spatial truss
No.  Design variable                                               This study (UMCTS)
1    𝐴1 – 𝐴4          2.6      2.0      2.0      2.0      2.3      1.9
2    𝐴5 – 𝐴12         1.5      0.6      0.5      0.5      0.4      0.5
3    𝐴13 – 𝐴16        0.3      0.4      0.1      0.1      0.1      0.1
4    𝐴17 – 𝐴18        0.1      0.6      0.1      0.1      0.1      0.1
5    𝐴19 – 𝐴22        2.1      0.5      1.3      1.3      1.2      1.2
6    𝐴23 – 𝐴30        1.5      0.5      0.5      0.5      0.5      0.5
7    𝐴31 – 𝐴34        0.6      0.1      0.1      0.1      0.1      0.1
8    𝐴35 – 𝐴36        0.3      0.1      0.1      0.1      0.1      0.1
9    𝐴37 – 𝐴40        2.2      1.4      0.5      0.5      0.5      0.5
10   𝐴41 – 𝐴48        1.9      0.5      0.5      0.5      0.5      0.5
11   𝐴49 – 𝐴52        0.2      0.1      0.1      0.1      0.1      0.1
12   𝐴53 – 𝐴54        0.9      0.1      0.1      0.1      0.1      0.1
13   𝐴55 – 𝐴58        0.4      1.9      0.2      0.2      0.1      0.1
14   𝐴59 – 𝐴66        1.9      0.5      0.6      0.6      0.6      0.5
15   𝐴67 – 𝐴70        0.7      0.1      0.4      0.4      0.3      0.4
16   𝐴71 – 𝐴72        1.6      0.1      0.6      0.6      0.8      0.7
Weight (lb)           1089.88  385.54   385.54   385.54   379.20   371.00
Table 12: Optimal design comparison for Case 2 of the 72-bar spatial truss
No.  Design variable                                               This study (UMCTS)
1    𝐴1 – 𝐴4          7.220    0.196    1.990    1.990    1.990    0.602
2    𝐴5 – 𝐴12         1.800    0.563    0.563    0.442    0.442    0.111
3    𝐴13 – 𝐴16        1.130    0.442    0.111    0.111    0.111    0.111
4    𝐴17 – 𝐴18        0.196    0.602    0.111    0.111    0.111    0.111
5    𝐴19 – 𝐴22        3.090    0.442    1.228    1.228    1.000    0.442
6    𝐴23 – 𝐴30        0.785    0.442    0.563    0.563    0.602    0.111
7    𝐴31 – 𝐴34        0.563    0.111    0.111    0.111    0.111    0.111
8    𝐴35 – 𝐴36        0.785    0.111    0.111    0.111    0.111    0.111
9    𝐴37 – 𝐴40        3.090    1.266    0.563    0.563    0.442    0.442
10   𝐴41 – 𝐴48        1.228    0.563    0.442    0.563    0.563    0.111
11   𝐴49 – 𝐴52        0.111    0.111    0.111    0.111    0.111    0.111
12   𝐴53 – 𝐴54        0.563    0.111    0.111    0.111    0.111    0.111
13   𝐴55 – 𝐴58        1.990    1.800    0.196    0.196    0.111    0.442
14   𝐴59 – 𝐴66        1.620    0.602    0.563    0.563    0.563    0.111
15   𝐴67 – 𝐴70        1.563    0.111    0.391    0.391    0.250    0.111
16   𝐴71 – 𝐴72        1.266    0.111    0.563    0.563    0.766    0.111
Weight (lb)           1209.48  390.73   389.33   389.33   385.19   130.31
Fig. 13: Constraints evaluated at the optimal design of 10-bar planar truss structure by
the UMCTS for (a) displacement constraints and (b) stress constraints
Fig. 14: Constraints evaluated at the optimal design of 25-bar spatial truss structure by
the UMCTS
Fig. 15: Constraints evaluated at the optimal design of 72-bar spatial truss structure by
the UMCTS
4.5. Solution stability
The statistical results for the 10-bar, 25-bar, and 72-bar trusses are obtained from 10
independent runs to test the stability of the method. Table 13 lists the best weight, the
worst weight, the average weight, and the standard deviation for each example. For the
10-bar truss, these values are 5477.32 lb, 5516.66 lb, 5488.38 lb, and 10.94 in Case 1,
and 5052.42 lb, 5067.33 lb, 5060.93 lb, and 3.72 in Case 2. For the 25-bar truss, they are
479.91 lb, 488.57 lb, 484.32 lb, and 2.63. For the 72-bar truss, they are 371.00 lb,
375.80 lb, 372.03 lb, and 1.34 in Case 1, and 130.31 lb, 130.31 lb, 130.31 lb, and 0.00 in
Case 2, where all 10 runs reach the same weight.
Table 13: Statistical results of the investigated examples
Investigated example          Best weight  Worst weight  Average weight  Standard deviation
10-bar planar truss (Case 1)  5477.32 lb   5516.66 lb    5488.38 lb      10.94
10-bar planar truss (Case 2)  5052.42 lb   5067.33 lb    5060.93 lb      3.72
25-bar spatial truss          479.91 lb    488.57 lb     484.32 lb       2.63
72-bar spatial truss (Case 1) 371.00 lb    375.80 lb     372.03 lb       1.34
72-bar spatial truss (Case 2) 130.31 lb    130.31 lb     130.31 lb       0.00
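Because the examples differ in weight by more than an order of magnitude, the spread in
Table 13 is easier to compare in relative terms. The following minimal sketch computes
the coefficient of variation and the relative gap between the worst and best runs from
the tabulated values. It assumes that the standard deviation is given in lb; the helper
name spread is illustrative.

```python
# (best, worst, average, standard deviation) copied from Table 13.
# The standard deviation is assumed to be in lb, like the weights.
stats = {
    "10-bar (Case 1)": (5477.32, 5516.66, 5488.38, 10.94),
    "10-bar (Case 2)": (5052.42, 5067.33, 5060.93, 3.72),
    "25-bar": (479.91, 488.57, 484.32, 2.63),
    "72-bar (Case 1)": (371.00, 375.80, 372.03, 1.34),
    "72-bar (Case 2)": (130.31, 130.31, 130.31, 0.00),
}


def spread(best, worst, average, std):
    """Return (coefficient of variation, relative worst-to-best gap)."""
    cv = std / average
    gap = (worst - best) / best
    return cv, gap


results = {name: spread(*row) for name, row in stats.items()}
for name, (cv, gap) in results.items():
    print(f"{name}: CV = {100 * cv:.3f} %, worst-best gap = {100 * gap:.2f} %")
```

In all five examples the coefficient of variation stays below 1 %, which is consistent
with the stability reported above.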
5. Conclusion
A new efficient optimization technique called UMCTS is proposed in this paper for the
constrained discrete sizing optimization of truss structures. UMCTS is an RL-based
algorithm that combines a novel update process with the MCTS method using the UCB.
In the update process, the best cross-sectional area of each member of the truss
structure is found by MCTS in every round, and the initial state of a round is the final
state of the previous round. Accelerators for the number of area selections and the
iteration number are introduced to reduce the computing time. In addition, for a given
state, the average reward is replaced with the best reward collected during the
simulation process to search for the best action.
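The effect of replacing the average reward with the best reward can be illustrated with
a small selection rule. The sketch below is not the implementation used in this study:
it assumes the common UCB form (value plus an exploration bonus), an exploration
constant of √2, and hypothetical names (Node, ucb_score, select_child) and reward values
chosen only for illustration.

```python
import math


class Node:
    """Statistics of one child node, i.e., one candidate area selection."""

    def __init__(self):
        self.visits = 0
        self.total_reward = 0.0
        self.best_reward = -math.inf

    def update(self, reward):
        """Record the reward of one simulation that passed through this node."""
        self.visits += 1
        self.total_reward += reward
        self.best_reward = max(self.best_reward, reward)


def ucb_score(child, parent_visits, c=math.sqrt(2.0), use_best=True):
    """UCB score: value term plus exploration bonus (assumed form)."""
    if child.visits == 0:
        return math.inf  # unvisited children are tried first
    if use_best:
        value = child.best_reward
    else:
        value = child.total_reward / child.visits
    return value + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(children, use_best=True):
    """Return the index of the child with the highest UCB score."""
    parent_visits = sum(child.visits for child in children)
    scores = [ucb_score(ch, parent_visits, use_best=use_best) for ch in children]
    return scores.index(max(scores))


# Hypothetical rewards: child 0 is consistently moderate, child 1 found one
# very good design among otherwise poor simulations.
children = [Node(), Node()]
for r in (0.5, 0.5, 0.5, 0.5):
    children[0].update(r)
for r in (0.1, 0.9, 0.1, 0.1):
    children[1].update(r)

choice_with_average = select_child(children, use_best=False)
choice_with_best = select_child(children, use_best=True)
print(choice_with_average, choice_with_best)
```

With equal visit counts the exploration bonus is identical for both children, so the
average-based rule prefers child 0, whereas the best-reward rule prefers child 1, which
contains the single best design found so far.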
Several typical truss examples, including the 10-bar planar truss, the cantilever planar
truss, the 25-bar spatial truss, and the 72-bar spatial truss, are used to validate the
proposed methodology. The examples show that the algorithm successfully minimizes
the weight of the truss structures while satisfying the design constraints. In terms of
CPU time, the UMCTS is at least ten times faster than the BB method. The numerical
results of this study are also compared with those of other metaheuristic optimization
methods, and the comparison shows that the UMCTS stably attains solutions lighter than
those of the other metaheuristic approaches. The proposed algorithm is not limited to
truss structures; it can also be applied to other structural optimization problems,
including shell, plate, frame, and solid structures.
References
Bonami P, Biegler LT, Conn AR, Cornuéjols G, Grossmann IE, Laird CD, Lee J, Lodi
A, Margot F, Sawaya N, Wächter A (2008) An algorithmic framework for convex mixed
integer nonlinear programs. Discret Optim 5(2):186-204
Browne CB, Powley E, Whitehouse D, Lucas SM, Cowling PI, Rohlfshagen P, Tavener
S, Perez D, Samothrakis S, Colton S (2012) A survey of Monte Carlo tree search
methods. IEEE Trans Comput Intell AI Games 4(1):1-43
Camp CV, Bichon BJ (2004) Design of space trusses using ant colony optimization. J
Struct Eng 130(5):741-751
Chhabra JPS, Warn GP (2019) A method for model selection using reinforcement
learning when viewing design as a sequential decision process. Struct Multidiscip
Optim 59(2):1521-1542
Christensen PW, Klarbring A (2009) An introduction to structural optimization.
Springer, Berlin, Germany
Duan MZ (1986) An improved Templeman's algorithm for the optimum design of
trusses with discrete member sizes. Eng Optimiz 9(4):303-312
Fu MC (2019) Simulation-based algorithms for Markov decision processes: Monte
Carlo tree search from AlphaGo to AlphaZero. Asia Pac J Oper Res 36(6):1940009
Groenwold AA, Stander N (1997) Optimal discrete sizing of truss structures subject to
buckling constraints. Struct Optim 14(2):71-80
Haftka RT, Gürdal Z (1992) Elements of structural optimization. Springer, Berlin,
Germany
Jawad FKJ, Mahmood M, Wang D, AL-Azzawi O, AL-JAMELY A (2021) Heuristic
dragonfly algorithm for optimal design of truss structures with discrete variables.
Structures 29(8):843-862
John KV, Ramakrishnan CV, Sharma KG (1987) Minimum weight design of trusses
using improved move limit method of sequential linear programming. Comput Struct
27(5):583-591
Kanno Y, Guo X (2010) A mixed integer programming for robust truss topology
optimization with stress constraints. Int J Numer Methods Eng 83(13):1675-1699
Kanno Y, Yamada H (2017) A note on truss topology optimization under self-weight
load: mixed-integer second-order cone programming approach. Struct Multidiscip
Optim 56(1):221-226
Kaveh A, Bakhshpoori T (2016) A new metaheuristic for continuous structural
optimization: water evaporation optimization. Struct Multidiscip Optim 54(1):23-43
Kocsis L, Szepesvári C (2006) Bandit based Monte-Carlo planning. In: Proceedings of
17th European conference on machine learning, pp 282-293
Kooshkbaghi M, Kaveh A, Zarfam P (2020) Different discrete ACCS algorithms for
optimal design of truss structures: A comparative study. Iran J Sci Technol-Trans Civ
Eng 44(10):49-68
Lee KS, Geem ZW, Lee SH, Bae KW (2005) The harmony search heuristic algorithm
for discrete structural optimization. Eng Optimiz 37(7):663-684
Li LJ, Huang ZB, Liu F (2009) A heuristic particle swarm optimization method for truss
structures with discrete variables. Comput Struct 87(7):435-443
Li HS, Ma YZ (2015) Discrete optimum design for truss structures by subset simulation
algorithm. J Aerosp Eng 28(4):04014091
Luo R, Wang Y, Liu Z, Xiao W, Zhao X (2022b) A reinforcement learning method for
layout design of planar and spatial trusses using kernel regression. Appl Sci-Basel
12(16):8227
Luo R, Wang Y, Xiao W, Zhao X (2022a) AlphaTruss: Monte Carlo tree search for
optimal truss layout design. Buildings-Basel 12(5):641
Sadollah A, Bahreininejad A, Eskandar H, Hamdi M (2012) Mine blast algorithm for
optimization of truss structures with discrete variables. Comput Struct 102-103:49-63
Schutte JF, Groenwold AA (2003) Sizing design of truss structures using particle
swarms. Struct Multidiscip Optim 25(4):261-269
Shahabsafa M, Mohammad-Nezhad A, Terlaky T, Zuluaga L, He S, Hwang JT, Martins
JRRA (2018) A novel approach to discrete truss design problems using mixed integer
neighborhood search. Struct Multidiscip Optim 58(4):2411-2429
Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Driessche G, Schrittwieser J,
Antonoglou I, Panneershelvam V, Lanctot M, Dieleman S, Grewe D, Nham J,
Kalchbrenner N, Sutskever I, Lillicrap T, Leach M, Kavukcuoglu K, Graepel T,
Hassabis D (2016) Mastering the game of Go with deep neural networks and tree search.
Nature 529(7587):484-489
Sonmez M (2011) Discrete optimum design of truss structures using artificial bee
colony algorithm. Struct Multidiscip Optim 43(1):85-97
Stolpe M (2016) Truss optimization with discrete design variables: a critical review.
Struct Multidiscip Optim 53(2):349-374
Sutton RS, Barto AG (2018) Reinforcement learning: An introduction, second edition.
The MIT Press, Cambridge, MA
Świechowski M, Godlewski K, Sawicki B, Mańdziuk J (2023) Monte Carlo tree search:
a review of recent modifications and applications. Artif Intell Rev 56(5):2497-2562
Rajeev S, Krishnamoorthy CS (1992) Discrete optimization of structures using genetic
algorithms. J Struct Eng 118(5):1233-1250
Ramu P, Thananjayan P, Acar E, Bayrak G, Park JW, Lee I (2022) A survey of machine
learning techniques in structural and multidisciplinary optimization. Struct Multidiscip
Optim 65(9):266
Templeman AB, Yates DF (1983) A segmental method for the discrete optimum design
of structures. Eng Optimiz 6(3):145-155
Yonekura K, Kanno Y (2010) Global optimization of robust truss topology via mixed
integer semidefinite programming. Optim Eng 11(3):355-379
