
Acta Mech

https://doi.org/10.1007/s00707-023-03676-2

ORIGINAL PAPER

Shirko Faroughi · Ali Darvishi · Shahed Rezaei

On the order of derivation in the training of physics-informed neural networks: case studies for non-uniform beam structures

Received: 2 May 2023 / Revised: 26 June 2023 / Accepted: 23 July 2023


© The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2023

Abstract The potential of the mixed formulation for physics-informed neural networks is investigated in
order to find a solution for a non-uniform beam resting on an elastic foundation and subjected to arbitrary
external loading. These types of structures are commonly encountered in civil engineering problems at larger
scales, as well as in the design of new generation meta-materials at smaller scales. The mixed formulation
approach aims to predict not only the primary variable itself but also its higher derivatives. In the context
of this work, the primary variable is the beam deflection, while its higher-order derivatives are associated
with the shear stress and moment within the beam structure. By employing this new methodology, it becomes
possible to reduce the order of the derivatives in the physical constraints used for training the neural networks.
As a result, significantly more accurate predictions can be achieved compared to the standard approach. The
obtained results are validated by comparison against available analytical solutions or those obtained from the
finite element method. Finally, the potential of the proposed method is discussed in terms of its ability to be
combined with available data in order to predict the solution for an arbitrary non-uniform beam resting on an
elastic foundation.

1 Introduction

Beams are fundamental structural elements that find wide-ranging applications in the aerospace, medical, automobile, and construction industries, as well as in buildings and rotating shafts. In civil, mechanical, and aeronautical engineering,
there is particular interest in structural components with variable mass and stiffness characteristics. Non-
prismatic beams, in particular, are of great significance as they are commonly employed not only to support
architectural design but also to enhance the strength of specific regions or reduce cross-sectional dimensions
for more cost-effective solutions. The behavior of beams resting on an elastic foundation has been extensively
investigated by numerous researchers. Several authors have proposed closed-form solutions to the differential
equation governing this problem [1–5]. Numerical approaches, such as the finite difference method [6] and
finite element method [7], have been employed to solve this problem and yield approximate solutions to the
differential equation, which converge to the exact solution as the number of nodes increases. Beams resting on an elastic foundation are very often encountered in engineering practice [8], and therefore an accurate and reliable method of analysis is required, especially when the properties of their cross-section are variable. The complexity of the problem increases greatly with the magnitude of the externally applied load, since the beam's transverse deflection influences the axial force and the resulting equations become coupled and nonlinear [9].
S. Faroughi (B) · A. Darvishi
Faculty of Mechanical Engineering, Urmia University of Technology, Urmia, Iran
e-mail: Sh.farughi@uut.ac.ir
S. Rezaei (B)
ACCESS E.V., Intzestraße 5, Aachen, Germany
e-mail: s.rezaei@access-technology.de

In this case, the linear elastic subgrade model is inadequate to describe the real behavior of the foundation
and the use of a more sophisticated nonlinear model becomes inevitable [10]. Therefore, utilizing numerical
solvers (such as finite element or deep learning methods) at this point is inevitable.
One of the emerging approaches to tackle engineering problems is the utilization of machine learning
methods. For instance, deep learning models have been successfully employed in parameter identification for various types of problems of varying complexity [11–13], including beams and rods
[14–16]. Deep learning methods [17] play an important role in a wide range of technologies that contribute
to computer vision [18], natural language processing [19], and other data-rich areas of social interest. Despite
the increasing sophistication of data analysis and neural networks (NNs), a significant portion of this work has
not yet capitalized on the availability of large amounts of scientific data that could be used to build predictive
models. Therefore, there is a compelling need to integrate experimentally confirmed mechanical findings and
the laws of physics into classical deep learning methods. In other words, in most scientific applications, we have
access to well-established physical conservation laws (e.g., laws of momentum and energy) and mathematical
formulas (e.g., partial differential equations in fields such as solid mechanics and fluid mechanics) that can
be leveraged to enhance the training of neural networks. Emerging research explores the profound potential
of physics-based deep learning approaches, with unprecedented opportunities for scientific and engineering
advances at different scales, from molecular analysis [20] to the design of novel materials with improved performance [21, 22]. See also the investigations conducted in [1, 23–25] regarding structural and functional applications and the characterization of material properties.
In order to fully unlock the potential of deep learning analyses in the context of physical systems, it is
crucial to develop methodologies that effectively address the fundamental issues governed by physical laws
and guided by mathematical formulas. In this regard, a recent advancement in the field has introduced a
physics-based deep learning approach to solve systems by explicitly incorporating the governing physical
laws, which are typically represented by partial differential equations. Traditionally, deep learning methods
have implicitly encoded these formulas by training on data generated from equations. However, this new
physics-based deep learning method takes a different approach by explicitly encoding the known physical
or scaling laws as mathematical equations within a standard neural network (NN) structure. As a result, this
approach formulates what is known as physics-informed neural networks (PINN). By explicitly incorporating
the physical laws, PINN aims to enhance the accuracy and interpretability of the deep learning models applied
to scientific problems [26]. Through such an approach, we integrate any existing knowledge expressible in
terms of PDEs into the learning process. As a result, one can improve predictability while reducing the amount
of data required to reach the desired level of accuracy. Various studies have demonstrated the utility of PINN
in addressing a large number of forward and inverse problems in subjects such as fluid mechanics [27–30], quantum mechanics [26], and solid mechanics [1, 31–40]. These approaches have shown promise for increasing
predictability when the amount of data is limited and in situations where current methods may not provide
accurate and reliable results [41]. This method is often extended to obtain unique paths for dealing with related
mathematical formulas, such as stochastic PDEs [42] and fractional PDEs [43]. For a comprehensive review
on physics-informed and physics-guided neural networks in scientific computing, readers are encouraged to
refer to [13].
Based on recent studies on physics-informed neural networks (PINN), it has been demonstrated that PINN
can be employed to analyze the bending and free vibration of three-dimensional functionally graded porous
beams. These beams possess material properties that vary continuously in three dimensions through the use
of an arbitrary function. By utilizing PINN, it becomes possible to solve the governing equations of motion
and predict the deflection of such beams. This is achieved by employing a deep feedforward neural network to
model the beam's behavior and training the network parameters through the minimization of a loss function [44]. See also the investigations in [45, 46] for problems related to digital materials. It is important to highlight that PINN can be effectively applied to both homogeneous and heterogeneous problems [47].
In scientific computing, physical phenomena are often described using a strong mathematical form consisting of governing differential equations as well as initial and boundary conditions. At each point inside a
domain, the strong form specifies the constraints that the solution must meet. The governing equations are
usually linear or nonlinear PDEs and/or ODEs. Some of the PDEs are notoriously challenging to solve, e.g., the
Navier–Stokes equations to explain a wide range of fluid flows [28], Foppl–von Karman equations to describe
large deflections in solids [48], etc. Other important PDE examples are heat equations [49], wave equation
[50], Burgers' equation [51], Poisson's equation [52], amongst others. This wealth of well-tested knowledge can be logically leveraged to further constrain the so-called physics-guided NNs while training on available data points [53]. To this end, mesh-free PINNs have been developed [26, 53], quickly extended [54, 55], and
extensively deployed in a variety of scientific and applied fields [14, 56–59].
This work focuses on analyzing the accuracy of deep learning in predicting the solution of structural
problems where we are dealing with high-order PDEs. Moreover, we intend to show how the accuracy can be
improved by reducing the order of the involved PDEs. To achieve this goal, we will examine the equations
related to the deflection of various beam structures, which have governing equations up to the 4th order.
Moreover, we shall investigate such structural problems in prismatic and non-prismatic states. Finally, we
present the deformation and deflection of such beams. The structure of the study is as follows. In Sect. 2,
the formulation of functionally graded materials for beam structure is summarized. In Sect. 3, the proposed
architecture of the deep learning algorithm is explained. Consequently, details of this investigation as well as
utilized methods and algorithms are discussed. In Sect. 4, the numerical results are presented. Finally, the work
is concluded in Sect. 5.

2 Formulation: a non-uniform beam resting on an elastic foundation

Consider a straight conical beam element of span length L subjected to a compressive axial load P and resting
on a uniform elastic Winkler–Pasternak foundation (see Fig. 1). We consider a right-handed Cartesian coordinate system, where x is the longitudinal axis measured from the left end of the beam, the y-axis is in the lateral direction, and the z-axis is taken along the thickness of the beam. The origin of these axes (O) is located at the centroid of the cross-section. It should be noted that the Euler–Bernoulli beam assumptions are adopted. According to this theory, the effect of shear deformation is neglected and only bending deformation is considered in the calculation. Based on Euler beam theory, the longitudinal and transverse displacement components can be expressed, respectively, as:

U(x, y, z) = −z w′(x), (1a)

W(x, y, z) = w(x). (1b)

Here, U denotes the axial displacement and W signifies the vertical displacement (in z direction).
If the first variation of the total potential energy vanishes, the equilibrium equations for the Euler beam with variable cross-section are obtained as follows:

δΠ = δ(U_l + U_0 + U_f − W_e) = 0. (2)

Here, δ denotes a virtual variation in the last formulation, and Π is the total potential energy. U_l represents the elastic strain energy, U_0 the strain energy due to initial stresses, U_f the energy corresponding to the uniform elastic foundation, and W_e the work of the external loads. δΠ can be computed using the following equation:

Fig. 1 Tapered beam resting on two-parameter elastic foundation



     
δΠ = ∫₀^L ∫_A σ_xx δε^l_xx dA dx + ∫₀^L ∫_A σ⁰_xx δε*_xx dA dx + ∫₀^L (k_w w δw + k_g w′ δw′) dx − ∫₀^L q δw dx. (3)

Here, L and A denote the element length and the cross-sectional area, respectively. δε^l_xx and δε*_xx are the variations of the linear and nonlinear parts of the strain tensor, respectively. The parameters k_w and k_g denote the Winkler elastic foundation constant and the second foundation parameter modulus in the vertical direction, respectively. Furthermore, q is the external force. σ_xx is the axial stress and σ⁰_xx signifies the initial normal stress in
the cross-section, associated with constant axial force (P):
σ⁰_xx = −P/A. (4)
According to Green’s strain tensor, the linear and nonlinear parts of the strain relations are:

ε^l_xx = −z w″(x), (5a)

ε*_xx = (1/2)(w′(x))². (5b)
Based on the last formulations, the variation of the strain tensor components is given by:

δε^l_xx = −z δw″(x), (6a)

δε*_xx = w′(x) δw′(x). (6b)

Substituting Eqs. (4)–(6) into relation (3), the virtual potential energy can be written as:

δΠ = ∫₀^L ∫_A σ_xx (−z δw″(x)) dA dx + ∫₀^L ∫_A (−P/A) w′ δw′ dA dx + ∫₀^L (k_w w δw + k_g w′ δw′) dx − ∫₀^L q δw dx = 0. (7)

The variation of the strain energy can be reformulated in terms of the stress resultants acting on the cross-section. These are defined by the following expressions:

N = ∫_A σ_xx dA, (8a)

M = ∫_A σ_xx z dA. (8b)

Here, N and M are the axial force applied at the end of the member and the bending moment, respectively. Using relations (7)–(8), the final form of the total potential energy variation (δΠ) is then obtained as:

δΠ = −∫₀^L M δw″(x) dx − ∫₀^L P w′ δw′ dx + ∫₀^L (k_w w δw + k_g w′ δw′) dx − ∫₀^L q δw dx. (9)

According to the above equation, the first variation of the strain energy involves the virtual displacement (δw) and its derivatives. After integration by parts, an expression in terms of the virtual displacement alone is obtained.

The following equilibrium equation is obtained after basic calculations and simplification in the stationary
state:
d²M/dx² = (P − k_g) d²w/dx² + k_w w − q. (10)
The boundary conditions of the present beam theory can also be expressed as:

M = 0 or w′ = 0, (11a)

M′ − (P − k_g) w′ = 0. (11b)
According to Hooke's law, the stress at a point for one-dimensional elastic materials can be written as follows:

σ_xx = E ε^l_xx, (12)

where E is Young's modulus. Multiplying Eq. (12) by z dA and integrating over the cross-sectional area in
the context of principal axes, the following expression is obtained:
M(x) + E I d²w/dx² = 0. (13)
In Eq. (13), I signifies the minor moment of inertia about the y-axis. By substituting Eq. (13) into Eq. (10), the equilibrium equation can be expressed in terms of the vertical displacement (w) as:
 
d²/dx² (E I d²w/dx²) + (P − k_g) d²w/dx² + k_w w − q = 0. (14)
Extending the above equation results in:
E I d⁴w/dx⁴ + 2E (dI/dx) d³w/dx³ + E (d²I/dx²) d²w/dx² + (P − k_g) d²w/dx² + k_w w − q = 0. (15)
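The product-rule expansion leading from Eq. (14) to Eq. (15) can be checked symbolically. Below is a minimal verification sketch using sympy; the symbol names are our own choice and not part of the paper:

```python
# Verify: d^2/dx^2 (E I(x) w'') = E I w'''' + 2 E I' w''' + E I'' w''.
import sympy as sp

x, E = sp.symbols("x E")
w = sp.Function("w")(x)
I = sp.Function("I")(x)

lhs = sp.diff(E * I * sp.diff(w, x, 2), x, 2)
rhs = (E * I * sp.diff(w, x, 4)
       + 2 * E * sp.diff(I, x) * sp.diff(w, x, 3)
       + E * sp.diff(I, x, 2) * sp.diff(w, x, 2))

assert sp.simplify(lhs - rhs) == 0  # the two forms agree term by term
```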

3 Design of the deep learning algorithm

The main goal is to utilize the capabilities of deep learning algorithms to find the solution of the governing equation of the beam structure described in Eq. (15). For this purpose, four densely connected independent networks are designed to predict the unknowns of the problem (see Fig. 2). Each network consists of an input layer, an output layer, and hidden layers placed between them. Here, we use points that are uniformly spaced along the length of the beam (x) as the values in the input layer. The parameters to be predicted as outputs are w, θ, M, and Q, which are the deflection, slope, moment, and shear, respectively. See also the investigations in [41].
There is another potential architecture that can be used to design the network: a single densely connected network that predicts only the deflection (see Fig. 3). This corresponds to the standard PINN, where the governing equation is constructed solely from the first and higher-order derivatives of the primary variable. As mentioned, in this work we compare the performance of these two types of NNs.
Since we have four neural networks that predict the output values w, θ, M, and Q, the four partial differential equations that govern these networks are as follows:

dw/dx − θ = 0, dθ/dx + M/(E I) = 0, (16a)

dM/dx + Q = 0, dQ/dx + q(x) = 0. (16b)
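In code, these four first-order residuals can be evaluated with automatic differentiation. The sketch below assumes PyTorch and hypothetical network callables (net_w, net_theta, net_M, net_Q) mapping positions x to the four fields; it is illustrative, not the authors' implementation:

```python
import torch

def grad(f, x):
    """df/dx by automatic differentiation (keeps the graph for training)."""
    return torch.autograd.grad(f, x, grad_outputs=torch.ones_like(f),
                               create_graph=True)[0]

def residuals(x, net_w, net_theta, net_M, net_Q, EI, q):
    """Residuals of Eqs. (16a)-(16b) at the collocation points x."""
    w, theta, M, Q = net_w(x), net_theta(x), net_M(x), net_Q(x)
    r_w = grad(w, x) - theta           # dw/dx - theta = 0
    r_theta = grad(theta, x) + M / EI  # dtheta/dx + M/(EI) = 0
    r_M = grad(M, x) + Q               # dM/dx + Q = 0
    r_Q = grad(Q, x) + q(x)            # dQ/dx + q(x) = 0
    return r_w, r_theta, r_M, r_Q
```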
We will train the networks so that their output values are as close as possible to the data, which may be real field data, synthetic data from the exact solution of the problem, or the result of a high-fidelity simulation. The cost function can be used for the deep-learning-based solution of PDEs as well as for the prediction of the output values.

Fig. 2 The architecture of the proposed mixed PINN for the problem of beam structures

Fig. 3 The architecture of the standard PINN for the problem of beam structures

3.1 Loss function

For the current formulation, known physical laws are implemented in terms of additional loss terms. These terms include the governing equations (i.e., the given ODEs or PDEs) as well as the given boundary conditions [26]. For the studied problems, the outputs of the neural network should satisfy the governing equations and the BCs (Eqs. (16a) and (16b)). For the mixed formulation, we implement the beam theory into the algorithm via the following total loss function:
L_tot = L_eqs + λ_s L_BCs, (17)
where L_eqs contains the contributions of the governing equations:

L_eqs = (1/n) Σ_{i=1}^{n} [ (L_w)² + (L_θ)² + (L_M)² + (L_Q)² ]. (18)
Here L_w, L_θ, L_M, and L_Q are the residuals of the four governing PDEs defined in Eqs. (16a) and (16b), and n is the number of training samples within the solution domain. Moreover, L_BCs is the residual of the BCs and λ_s is the weight of the boundary-condition loss. By minimizing the total loss, all PDEs and boundary conditions will be satisfied. Therefore, the trained neural network can predict the approximate solution.
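One possible composition of the total loss of Eqs. (17)–(18), with a mean-squared reduction over collocation points and the weight λ_s on the boundary terms; the function signature is our assumption:

```python
import torch

def total_loss(pde_residuals, bc_residuals, lambda_s=1.0):
    """L_tot = L_eqs + lambda_s * L_BCs, cf. Eqs. (17)-(18)."""
    # L_eqs: mean over collocation points of the summed squared PDE residuals
    loss_eqs = sum(torch.mean(r ** 2) for r in pde_residuals)
    # L_BCs: squared mismatch of the network outputs at the boundaries
    loss_bcs = sum(torch.mean(r ** 2) for r in bc_residuals)
    return loss_eqs + lambda_s * loss_bcs
```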
Since the boundary conditions differ depending on the type of beam, we will examine the loss terms related to the boundary conditions for each problem separately, in its own section. In Fig. 4, one can see the types of boundary conditions that we deal with in this paper.
Based on Fig. 4, the loss functions can be calculated for each of the boundary conditions. The general forms of these functions are reported in Eqs. (19a)–(19d):

L_wBCs = |w_NN − w|_{x=0} + |w_NN − w|_{x=L}, (19a)

L_θBCs = |θ_NN − θ|_{x=0} + |θ_NN − θ|_{x=L}, (19b)

L_MBCs = |M_NN − M|_{x=0} + |M_NN − M|_{x=L}, (19c)

L_QBCs = |Q_NN − Q|_{x=0} + |Q_NN − Q|_{x=L}. (19d)

Fig. 4 Different types of boundary conditions in beam structures

Fig. 5 Characteristics and boundary conditions of the simply supported thin beam

4 Numerical examples

In the following, we illustrate the performance of the proposed PINN method in the context of three benchmark
examples. Section 4.1 demonstrates an elastic simply supported thin beam as a classical problem. Section 4.2
focuses on an elastic non-prismatic cantilever thin beam problem. Finally, in Sect. 4.3 we present the solution
for a thin beam on an elastic foundation.

4.1 An elastic simply supported beam

The first example considers a thin elastic beam subjected to an extended sinusoidal load. The analysis is
formulated as a fourth-order partial differential equation problem. In Fig. 5, we illustrate the geometry and the
boundary conditions for this problem.
In this case, the corresponding equations are as follows:

M = −E I d²w/dx², Q = −d/dx(E I d²w/dx²), E I d⁴w/dx⁴ = q(x). (20)
The exact solution of this problem can be written as

w_true = −50 (1/(E I)) (L/π)⁴ sin(πx/L), θ_true = −50 (1/(E I)) (L/π)³ cos(πx/L), (21a)

M_true = 50 (1/(E I)) (L/π)² sin(πx/L), Q_true = 50 (1/(E I)) (L/π) cos(πx/L). (21b)
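For validation, the closed-form fields of Eqs. (21a)–(21b) can be evaluated directly; in the sketch below, L and EI are placeholder values to be set per problem:

```python
import numpy as np

def exact_fields(x, L=1.0, EI=1.0):
    """Exact w, theta, M, Q of Eqs. (21a)-(21b), simply supported beam."""
    s = np.sin(np.pi * x / L)
    c = np.cos(np.pi * x / L)
    w = -50.0 / EI * (L / np.pi) ** 4 * s
    theta = -50.0 / EI * (L / np.pi) ** 3 * c
    M = 50.0 / EI * (L / np.pi) ** 2 * s
    Q = 50.0 / EI * (L / np.pi) * c
    return w, theta, M, Q
```

Note that w and M vanish at both ends, consistent with the simply supported boundary conditions of Eq. (22).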

Finally, the boundary conditions are depicted in the following:

w(0) = 0, ∂²w(0)/∂x² = w″(0) = 0, (22a)

w(L) = 0, ∂²w(L)/∂x² = w″(L) = 0. (22b)
Next, we examine the loss function for each of the output variables and boundary conditions of this problem. For the parameters w, θ, M, and Q, the loss functions are defined as follows:

L_w = |dw/dx − θ| + L_wBCs, (23a)

L_θ = |dθ/dx + M/(E I)|, (23b)

L_M = |dM/dx + Q| + L_MBCs, (23c)

L_Q = |dQ/dx + q(x)|. (23d)

In this example we are examining a simply supported beam. Therefore, the expressions for the boundary conditions can be calculated as follows:

L_wBCs = |w_NN − 0|_{x=0} + |w_NN − 0|_{x=L}, (24a)

L_MBCs = |M_NN − 0|_{x=0} + |M_NN − 0|_{x=L}. (24b)

4.1.1 Neural network setup

The values of interest for beams are w, θ, M, and Q. The input of our mixed PINN consists of sampling points along the length of the beam, x. Moreover, we introduced two designs for the neural network architecture: the standard PINN (Fig. 3) and the mixed PINN (Fig. 2). For the mixed PINN, we have the four partial differential equations given in Eqs. (16a) and (16b). Next, we train the network in such a way that the solution for the parameters of the problem is equal to the exact values, or close to these values with a very low error percentage. By comparing these values with each other, we assess the accuracy of the network.
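A possible realization of the four independent networks described in the following subsection (4 hidden layers of 20 neurons with Swish activation, as stated in the text); the module layout itself is our assumption:

```python
import torch
import torch.nn as nn

def make_net(hidden_layers=4, width=20):
    """Scalar-in, scalar-out MLP: x -> one of w, theta, M, Q."""
    layers, in_dim = [], 1
    for _ in range(hidden_layers):
        layers += [nn.Linear(in_dim, width), nn.SiLU()]  # SiLU = Swish, beta=1
        in_dim = width
    layers.append(nn.Linear(in_dim, 1))
    return nn.Sequential(*layers)

# one independent network per output quantity (mixed PINN)
nets = {name: make_net() for name in ("w", "theta", "M", "Q")}
```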

4.1.2 Predicted solutions

Here, we use the PINN to find the solution for the model outputs w, θ, M, and Q. As explained earlier, our features consist of points along the length of the beam; for this problem we use 10,000 uniformly distributed points. We study how the accuracy of the solution process depends on whether we use several independent networks for the different quantities of interest. To study the impact of the architecture and functional form of the PINN, we use 4 different networks with 4 hidden layers and 20 neurons per layer. In this paper, the activation function used in the neural network is Swish, which is defined as follows:
swish(x) = x · sigmoid(βx) = x / (1 + e^{−βx}). (25)
Swish is a non-monotonic function with a smooth curve that consistently matches or outperforms the accuracy of the Rectified Linear Unit (ReLU) function; β is either a constant or a trainable parameter. Figure 6 shows the graph of Swish for different values of β.
It is worth mentioning that the Swish function allows small negative values to be propagated through, while ReLU thresholds all negative values to zero. This property is crucial to the success of non-monotonic smooth activation functions, like Swish, in deep neural networks. Moreover, a suitable choice of activation function allows for better tuning of the network's parameters by maximizing information propagation and pushing for smoother gradients. The latter makes the loss landscape easier to optimize, thus generalizing better and faster. The Swish activation function is usually used for networks with a high number of hidden layers, but according to the results obtained in our

Fig. 6 Swish function for different values of β

Fig. 7 Comparison of the obtained numerical solution for deflection and slope with exact values using the mixed formulation

studies and the comparisons made, we observed that Swish can also be used for neural networks with a smaller number of hidden layers and provides acceptable results.
If β = 1, Swish is equivalent to the Sigmoid-weighted Linear Unit (SiL) of Elfwing et al. [60], which was proposed for reinforcement learning. If β = 0, Swish becomes the scaled linear function, i.e., swish(x) = x/2. As β → ∞, the sigmoid component approaches a 0–1 step function, so Swish becomes like the ReLU function. Swish can thus be viewed as a smooth function that interpolates between the linear function and ReLU [61].
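These limiting cases are easy to confirm numerically; a small framework-independent sketch:

```python
import numpy as np

def swish(x, beta=1.0):
    """swish(x) = x * sigmoid(beta * x), cf. Eq. (25)."""
    return x / (1.0 + np.exp(-beta * x))

x = np.linspace(-5.0, 5.0, 101)
# beta = 0: the scaled linear function x/2
assert np.allclose(swish(x, beta=0.0), x / 2.0)
# large beta: Swish approaches ReLU
assert np.allclose(swish(x, beta=50.0), np.maximum(x, 0.0), atol=1e-2)
```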
After minimizing the total loss function for the neural network using the mixed formulation, we obtain the values of the variables w, θ, M, and Q via four densely connected independent networks. Note that each of these networks is responsible for predicting only one of these variables (the idea of the mixed PINN). Next, we compare the obtained solution with the exact values. Figures 7 and 8 show that the solution and the exact values are in quite acceptable agreement.
Figure 9 shows the evolution of the total loss term as a function of epoch numbers.
In Fig. 10, one can see the loss related to each output variable through the minimization process.
By comparing the numerical solution with the exact values, the error corresponding to w is error_w = 0.133%, the error corresponding to θ is error_θ = 0.176%, the error corresponding to M is error_M = 0.004%, and the error corresponding to Q is error_Q = 0.03%. In Fig. 11, one can also see the loss terms related to each of the boundary conditions.
In Fig. 12, we investigate how the total loss will evolve if instead of four densely connected independent
networks associated with w, θ , M and Q (mixed PINN, see Fig. 2), we use a single densely connected network
to predict only the values for the beam deflection w (standard PINN, see Fig. 3).
Based on Fig. 12, the neural network architecture that we have considered in our mixed PINN formulation has better accuracy and performance than the standard PINN architecture. For a better comparison, in Fig. 13, we compare the deflection obtained from both formulations, which clearly indicates that the mixed PINN formulation performs better.

Fig. 8 Comparison of the obtained numerical solution for M and Q with exact values using the mixed formulation

Fig. 9 The value of overall loss in each epoch for the mixed PINN formulation

Fig. 10 The loss of all the output values that are defined in the mixed PINN formulation

Fig. 11 The loss of all boundary conditions for the mixed PINN formulation

Fig. 12 Overall loss of mixed PINN and the standard PINN formulations

The neural network error value in the standard PINN formulation is error_w = 5.69%, which is significantly higher than the error of the mixed PINN formulation.
In Fig. 14, we examine the time spent per epoch. Based on the obtained results, the time spent training the neural network with our mixed PINN formulation is much less than the time spent training the neural network with the standard PINN formulation. The higher computational time of the standard PINN can be attributed to the presence of the fourth derivative.

4.2 An elastic non-prismatic cantilever thin beam

The second example considers an elastic non-prismatic cantilever thin beam subjected to a uniformly distributed
load. The analysis is formulated as a fourth-order partial differential equation problem. Figure 15 illustrates
the geometry and the boundary conditions for this problem.
In this example, the corresponding equations are as follows:

d²/dx² (E I(x) d²w/dx²) + q(x)·b = 0, h = h₀ (1 + (L − x)/L)², I(x) = b h³/12, (26a)

E I(x) d⁴w/dx⁴ + 2E (dI(x)/dx) d³w/dx³ + E (d²I(x)/dx²) d²w/dx² + q(x)·b = 0. (26b)
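The variable cross-section of Eq. (26a) is straightforward to express in code; b, h0, and L below are placeholder values chosen for illustration:

```python
import numpy as np

def height(x, h0=1.0, L=1.0):
    """Tapered height h(x) = h0 * (1 + (L - x)/L)**2 from Eq. (26a)."""
    return h0 * (1.0 + (L - x) / L) ** 2

def inertia(x, b=1.0, h0=1.0, L=1.0):
    """Second moment of area I(x) = b * h(x)**3 / 12."""
    return b * height(x, h0, L) ** 3 / 12.0
```

At the free end (x = L) the height is h0, while at the clamped end (x = 0) it is 4·h0, so I varies by a factor of 4³ = 64 along the beam.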

Fig. 13 Comparison of the solution for deflection in the mixed PINN formulation and the standard PINN formulation with the exact values of deflection

Fig. 14 Comparison of computational time spent per epoch for the mixed PINN formulation and the standard PINN formulation

Fig. 15 Characteristics and boundary conditions of the elastic non-prismatic cantilever thin beam

The exact solution of this problem is

w_true = 3π/10880 + 3 arctan(x − 1)/2720 − 3x arctan(x − 1)/2720 − 3πx/10880 − 3/(2720(x² − 2x + 2)) + 3/5440, (27a)

θ_true = 3(2x − 2)/(2720(x² − 2x + 2)²) − 3 arctan(x − 1)/2720 − 3π/10880 − 3(x − 1)/(2720((x − 1)² + 1)), (27b)

M_true = x²/2 − x + 1/2, Q_true = −(1 − x). (27c)
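As a sanity check, the slope of Eq. (27b) should be the derivative of the deflection in Eq. (27a); a quick finite-difference verification sketch (not part of the training pipeline):

```python
import numpy as np

def w_true(x):
    """Deflection of Eq. (27a)."""
    return (3*np.pi/10880 + 3*np.arctan(x - 1)/2720
            - 3*x*np.arctan(x - 1)/2720 - 3*np.pi*x/10880
            - 3/(2720*(x**2 - 2*x + 2)) + 3/5440)

def theta_true(x):
    """Slope of Eq. (27b)."""
    return (3*(2*x - 2)/(2720*(x**2 - 2*x + 2)**2)
            - 3*np.arctan(x - 1)/2720 - 3*np.pi/10880
            - 3*(x - 1)/(2720*((x - 1)**2 + 1)))

x, h = np.linspace(0.1, 0.9, 9), 1e-6
fd = (w_true(x + h) - w_true(x - h)) / (2*h)  # central difference dw/dx
assert np.allclose(fd, theta_true(x), atol=1e-8)
```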
The boundary conditions are depicted in the following:

w(0) = 0, ∂w(0)/∂x = w′(0) = 0, (28a)

∂²w(L)/∂x² = w″(L) = 0, ∂³w(L)/∂x³ = w‴(L) = 0. (28b)
Just like the previous example, in this problem we also consider the loss functions related to the problem parameters and boundary conditions:

L_w = |dw/dx − θ| + L_wBCs, (29a)

L_θ = |dθ/dx + M/(E I(x))| + L_θBCs, (29b)

L_M = |dM/dx + Q| + L_MBCs, (29c)

L_Q = |dQ/dx + q(x)| + L_QBCs. (29d)

Considering that the beam under investigation is no longer simply supported, the boundary conditions have also changed. The loss functions related to these boundary conditions for a non-prismatic cantilever beam are as follows:

L_wBCs = |w_NN − 0|_{x=0}, (30a)

L_θBCs = |θ_NN − 0|_{x=0}, (30b)

L_MBCs = |M_NN − 0|_{x=L}, (30c)

L_QBCs = |Q_NN − 0|_{x=L}. (30d)
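These four boundary residuals can be encoded as follows; the nets dictionary of per-quantity networks is a hypothetical structure assumed for illustration:

```python
import torch

def bc_residuals(nets, L=1.0):
    """Cantilever boundary residuals of Eqs. (30a)-(30d)."""
    x0 = torch.zeros(1, 1)       # clamped end
    xL = torch.full((1, 1), L)   # free end
    return (nets["w"](x0),       # w(0) = 0
            nets["theta"](x0),   # theta(0) = 0
            nets["M"](xL),       # M(L) = 0
            nets["Q"](xL))       # Q(L) = 0
```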

4.2.1 Neural network setup

The values of interest in this problem are the same as in the previous example: w, θ, M, and Q. Similar to the previous case, we have four densely connected independent networks with only one output each, associated with w, θ, M, and Q (mixed PINN, Fig. 2). The four partial differential equations that govern these networks are as follows:

dw/dx − θ = 0, dθ/dx + M/(E I(x)) = 0, (31a)

dM/dx + Q = 0, dQ/dx + q(x) = 0. (31b)
Comparing Eqs. (31a) and (31b) with the previous example, one notices that the only difference in the partial differential equations is the variable moment of inertia I(x), which is caused by the non-uniformity of the investigated beam.

Fig. 16 Comparison of the solution for deflection and slope with exact values

Fig. 17 Comparison of the solution for M and Q with exact values

4.2.2 Predicted solutions

The idea is to use the PINN to predict the output values w, θ, M, and Q. We define 10,000 collocation points that are uniformly distributed. As in the neural network architecture of Fig. 2, we use 4 different networks with 4 hidden layers and 20 neurons per layer. In this example the activation function is Swish (Eq. (25)); we use this activation function for all the following studies. After minimizing all the corresponding loss functions, we compare the solutions with the exact values. Figures 16 and 17 show the comparison of the obtained solution with the exact values. In this second example, we have used 6000 epochs to train our neural network (see also Fig. 18).
As for all examples in this paper, we have entered the boundary conditions along with the four partial differential equations in Eqs. (31a) and (31b) as governing equations into the neural network. In Fig. 19, one can see the loss functions as the epoch number increases.
By comparing the solution with the exact values, the error corresponding to w is error_w = 3.27%, the error corresponding to θ is error_θ = 2.83%, the error corresponding to M is error_M = 0.002%, and the error corresponding to Q is error_Q = 0.0085%. In Fig. 20, one can also see the error related to each of the boundary conditions.
In Fig. 21, as in the previous example, we will examine the evolution of total loss if instead of four densely
connected independent networks associated with w, θ , M and Q (mixed PINN, see Fig. 2), we use a single
densely connected network to predict only the values for the beam deflection w (standard PINN, see Fig. 3).
Based on Fig. 21, the neural network architecture that we have considered in our mixed PINN formulation has significantly better accuracy and higher performance than the standard PINN formulation. In Fig. 22, we compare the deflection from the mixed PINN formulation and the standard PINN formulation with the exact values
On the order of derivation in the training of physics-informed neural networks

Fig. 18 The value of overall loss in each epoch for the mixed PINN formulation

Fig. 19 The loss of all the output values that are defined in the mixed PINN formulation

Fig. 20 The loss of all boundary conditions for the mixed PINN formulation

Fig. 21 Comparison of the overall loss of the mixed PINN formulation with the overall loss of the standard PINN formulation for the elastic non-prismatic cantilever thin beam

Fig. 22 Comparison of the solution for deflection in the mixed PINN formulation and the standard PINN formulation with the exact values of deflection for the elastic non-prismatic cantilever thin beam

for the elastic non-prismatic cantilever thin beam. Based on this comparison, we realize that the architecture we have considered for our mixed PINN formulation results in significantly better accuracy under the same circumstances.
The error resulting from the standard PINN formulation is error_w = 114.36%, which, as expected, is larger than that of the mixed PINN formulation. In Fig. 23, we examine the time spent per epoch for the second example (the elastic non-prismatic cantilever thin beam). Based on this figure, the time spent training the neural network in our mixed PINN formulation is much less than that spent training the neural network with the standard PINN architecture.

4.3 An elastic non-prismatic cantilever beam resting on a Winkler layer

The third example considers an elastic non-prismatic cantilever thin beam resting on a layer of Winkler springs, subjected to a uniformly distributed load. The analysis is formulated as a fourth-order partial differential equation problem. Figure 24 illustrates the geometry and the boundary conditions for this problem.

Fig. 23 Comparison of the time spent per epoch in the mixed PINN formulation with that in the standard PINN formulation for the elastic non-prismatic cantilever thin beam

Fig. 24 Characteristics and boundary conditions of the elastic non-prismatic cantilever thin beam resting on a Winkler layer (K_w)

In this example, the corresponding equations are as follows:

    d²/dx² [ E I(x) d²w/dx² ] + k·w·b = q(x)·b,   h = h₀ (1 + (L − x)/L)²,   (32a)
    I(x) = b h³ / 12.   (32b)
In this example, unlike the previous problems where exact values were available, we check the solution of this structure with two numerical approaches. The first approach is the standard finite element method using the Abaqus software. The second is the mixed PINN introduced in the previous sections. Finally, we compare the solutions obtained from each of these methodologies and check the error at each point. The boundary conditions are as follows:

    w(0) = 0,   (33a)
    ∂w(0)/∂x = w′(0) = 0,   ∂²w(L)/∂x² = w″(L) = 0,   (33b)
    ∂³w(L)/∂x³ = w‴(L) = k·w(L)·b.   (33c)

4.3.1 Neural network setup

The values of interest in this problem are the same as before, namely w, θ, M and Q. Again, we have four densely connected independent networks with only one output each, associated with w, θ, M and Q (see the mixed PINN in Fig. 2). The four partial differential equations that govern each of these networks are as follows:

    dw/dx − θ = 0,   dθ/dx + M/(E I(x)) = 0,   (34a)
    dM/dx + Q = 0,   dQ/dx + q(x)·b + k·w·b = 0.   (34b)
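The reduction to first-order derivatives can be checked numerically, as in the sketch below. A trained PINN would obtain these derivatives by automatic differentiation; here central finite differences on a manufactured solution (with EI = 1, k = 0, b = 1, chosen only for illustration) stand in for them:

```python
import numpy as np

def mixed_residuals(x, w, theta, M, Q, EI, q, k, b):
    # Residuals of the first-order system, Eqs. (34a)-(34b);
    # np.gradient gives central differences in the interior
    d = lambda f: np.gradient(f, x)
    return (d(w) - theta,
            d(theta) + M / EI,
            d(M) + Q,
            d(Q) + q * b + k * w * b)

# Manufactured fields consistent with the sign conventions above:
# w = x^2, theta = 2x, M = -2, Q = 0, q = 0
x = np.linspace(0.0, 1.0, 101)
w, theta = x ** 2, 2.0 * x
M, Q = -2.0 * np.ones_like(x), np.zeros_like(x)
r = mixed_residuals(x, w, theta, M, Q, EI=1.0, q=0.0, k=0.0, b=1.0)
```

Central differences are exact for these low-order polynomial fields, so all four residuals vanish at the interior collocation points, mirroring what a converged mixed PINN should achieve.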
By comparing the partial differential equations presented in this section with those of the previous problem, one can see that an additional term (k·w·b) appears, owing to the presence of the Winkler layer. Accordingly, for this example, the loss functions for each of the parameters are defined as follows:

    L_w = ‖ dw/dx − θ ‖ + L_w,BCs,   (35a)
    L_θ = ‖ dθ/dx + M/(E I(x)) ‖ + L_θ,BCs,   (35b)
    L_M = ‖ dM/dx + Q ‖ + L_M,BCs,   (35c)
    L_Q = ‖ dQ/dx + q(x)·b + k·w·b ‖ + L_Q,BCs.   (35d)

Furthermore, the loss functions for the boundary conditions of this problem are given by the following relations:


    L_w,BCs = ‖ w_NN − 0 ‖ at x = 0,   (36a)
    L_θ,BCs = ‖ θ_NN − 0 ‖ at x = 0,   (36b)
    L_M,BCs = ‖ M_NN − 0 ‖ at x = L,   (36c)
    L_Q,BCs = ‖ Q_NN − k·w·b ‖ at x = L.   (36d)
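Assuming mean-squared residuals for the norms (the exact norm is not spelled out in this chunk), Eqs. (35a)-(35d) together with the boundary terms of Eqs. (36a)-(36d) can be assembled as in this sketch, again with finite differences standing in for automatic differentiation:

```python
import numpy as np

def mse(r):
    # Mean squared residual over the collocation points (or a single BC point)
    return float(np.mean(np.square(r)))

def total_loss(x, w, theta, M, Q, EI, q, k, b):
    # PDE losses, Eqs. (35a)-(35d), each augmented with its boundary term,
    # Eqs. (36a)-(36d); x[0] corresponds to x = 0 and x[-1] to x = L
    d = lambda f: np.gradient(f, x)
    loss_w  = mse(d(w) - theta)             + mse(w[0])                   # w(0) = 0
    loss_th = mse(d(theta) + M / EI)        + mse(theta[0])               # theta(0) = 0
    loss_M  = mse(d(M) + Q)                 + mse(M[-1])                  # M(L) = 0
    loss_Q  = mse(d(Q) + q*b + k*w*b)       + mse(Q[-1] - k*w[-1]*b)      # Q(L) = k w(L) b
    return loss_w + loss_th + loss_M + loss_Q
```

An optimizer would drive this scalar toward zero by adjusting the weights of the four networks; note that only first derivatives of the network outputs appear.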

4.3.2 Predicted solutions

We utilize the mixed PINN formulation to predict the solution for the model outputs w, θ, M and Q. Our inputs are the positions of 1000 collocation points uniformly distributed along the length of the beam. The neural network architecture for this problem is exactly like that of the previous examples. Furthermore, the activation function used is swish (Eq. (25)). Next, we minimize the loss functions for the four densely connected independent networks. To shorten the content, we compare the solution from the mixed PINN formulation directly with the data extracted from the Abaqus calculation at each point. Later, we check the difference in the output parameters at each point between the finite element analysis and the trained neural networks. In Figs. 25, 26, 27 and 28, we report the values extracted from the Abaqus software as well as from the mixed PINN formulation for the deflection, rotation, moment, and shear force distribution, respectively.
Figures 25, 26, 27 and 28, arranged from left to right, illustrate the output parameters obtained from both the Abaqus software and the mixed PINN model, together with the absolute error between the two, i.e., the absolute error obtained by comparing the parameter values from the mixed PINN model with the corresponding Abaqus output at each data point.
The error detected for the mixed PINN results compared to the Abaqus results for the parameter w is equal to 0.049%. Performing the same calculation for the θ, M, and Q parameters yields errors of 0.05%, 5.07 × 10⁻⁶%, and 4.93 × 10⁻⁶%, respectively.
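Since the exact error metric is not spelled out in this chunk, a common relative L2 percentage error of the following form may serve as an illustration of how such point-wise comparisons are aggregated:

```python
import numpy as np

def rel_l2_percent(pred, ref):
    # Relative L2 error in percent between a predicted field and a
    # reference field sampled at the same points; one common choice,
    # assumed here for illustration
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return 100.0 * np.linalg.norm(pred - ref) / np.linalg.norm(ref)
```

For instance, a prediction uniformly 1% above the reference yields an error of 1%, matching the scale of the percentages reported above.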

Fig. 25 Values of the deflection w(x) obtained through Abaqus software and the mixed PINN formulation

Fig. 26 Values of the rotation θ(x) obtained through Abaqus software and the mixed PINN formulation

Fig. 27 Values of the moment M(x) obtained through Abaqus software and the mixed PINN formulation

4.4 Deflection of a heterogeneous cantilever thin beam

This section focuses on the examination of an elastic heterogeneous cantilever thin beam that is exposed
to a uniformly distributed load. Figure 29 shows the geometric configuration of the beam along with the
prescribed boundary conditions that define the problem. Here, the Young's modulus varies according to E(x) = 3400 e^(4x). The rest of the governing equations, boundary conditions, and loss terms are like those described in the previous sections.
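The assumed stiffness variation can be evaluated directly; for a unit-length beam, E(L)/E(0) = e⁴ ≈ 54.6, i.e., the stiffness grows by more than an order of magnitude along the span:

```python
import numpy as np

def youngs_modulus(x):
    # Spatially varying Young's modulus E(x) = 3400 * exp(4 x);
    # units follow whatever system the paper uses
    return 3400.0 * np.exp(4.0 * x)
```

In the mixed PINN loss this simply replaces the constant E in the M/(E I(x)) term, without raising the order of any derivative.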
The values of interest in this problem are w, θ, M and Q. We have four densely connected independent networks with only one output each, associated with w, θ, M and Q (see the formulation for the mixed PINN in Fig. 2). The governing first-order equations for these networks are those described in the previous sections.
We define 1000 collocation points that are uniformly distributed. We use four different networks with four hidden layers and 20 neurons per layer. The activation function is swish. After minimizing all the corresponding loss

Fig. 28 Values of the shear force Q(x) obtained through Abaqus software and the mixed PINN formulation

Fig. 29 Characteristics and boundary conditions of the heterogeneous beam

Fig. 30 Comparison of the solutions between mixed-PINN and FEM for deflection and slope

functions, we compare these solutions with the solution that we obtained through the finite element method. Figures 30, 31, 32 and 33 show the comparison of the obtained solution with the finite element solution. In this example, we have used 6000 epochs in training our neural network.
By comparing the solution with the FEM solution, the error corresponding to w is error_w = 3.27%, the error corresponding to θ is error_θ = 2.83%, the error corresponding to M is error_M = 3.002%, and the error corresponding to Q is error_Q = 0.067%.

Fig. 31 Comparison of the solutions between mixed-PINN and FEM for M and Q

Fig. 32 The value of the overall loss in each epoch for the mixed PINN formulation

Fig. 33 The loss of all the output values that are defined in the mixed PINN formulation

5 Conclusion

Our motivation in carrying out these studies was a detailed examination of the accuracy and computational time of the mixed PINN formulation and the standard PINN formulation when applied to higher-order differential equations. Specific problems and examples in this study are devoted to beam structures
under various types of complicated boundary and foundation conditions. We examined a few examples in
Sects. 4.1, 4.2, 4.3, where different modes, boundaries, and loading conditions are considered for a thin
beam structure. The results of different network architectures are compared against those from analytical and
numerical calculations. We compare the performance of the proposed mixed formulation for PINN against
the standard version of PINN. In more complicated scenarios where the analytical solution is not available,
we compared the solution from deep learning with those obtained from Abaqus software. In all the reported
results, we observed a very good agreement between different approaches.

The final point emphasizes the potential of the proposed method in accurately predicting desired solutions for beam structures subjected to complex resting and loading conditions. A significant finding from this study is the importance of reducing the order of derivatives when constructing the loss functions. In all investigated problems, the mixed PINN formulation (utilizing first-order derivatives) consistently yields more accurate results than the standard PINN version (employing second- and fourth-order derivatives). Additionally, the mixed formulation requires less computational time during training. The increased computational time of the standard PINN formulation can be attributed to the computation of the fourth derivative, which is more challenging. For future work, a promising direction would be to combine data and physics, incorporating operator learning techniques to address structural equations under arbitrary loading and boundary conditions. Furthermore, the application of the proposed method to beam structures with significant heterogeneity in properties holds great interest and is worth further exploration.


