
Lecture Notes on Numerical Methods

in Engineering and Sciences

Genki Yagawa
Atsuya Oishi

Computational
Mechanics
with Deep
Learning
An Introduction
Lecture Notes on Numerical Methods
in Engineering and Sciences

Series Editor
Eugenio Oñate, Jordi Girona, 1, Edifici C1 - UPC, Universitat Politècnica de
Catalunya, Barcelona, Spain

Editorial Board
Charbel Farhat, Department of Mechanical Engineering, Stanford University,
Stanford, CA, USA
C. A. Felippa, Department of Aerospace Engineering Science, University of
Colorado, College of Engineering & Applied Science, Boulder, CO, USA
Antonio Huerta, Universitat Politècnica de Catalunya, Barcelona, Spain
Thomas J. R. Hughes, Institute for Computational Engineering, University of Texas
at Austin, Austin, TX, USA
Sergio Idelsohn, CIMNE - UPC, Barcelona, Spain
Pierre Ladevèze, Ecole Normale Supérieure de Cachan, Cachan Cedex, France
Wing Kam Liu, Evanston, IL, USA
Xavier Oliver, Campus Nord UPC, International Center of Numerical Methods,
Barcelona, Spain
Manolis Papadrakakis, National Technical University of Athens, Athens, Greece
Jacques Périaux, CIMNE - UPC, Barcelona, Spain
Bernhard Schrefler, Mechanical Sciences, CISM - International Centre for
Mechanical Sciences, Padua, Italy
Genki Yagawa, School of Engineering, University of Tokyo, Tokyo, Japan
Mingwu Yuan, Beijing, China
Francisco Chinesta, Ecole Centrale de Nantes, Nantes Cedex 3, France
This series publishes textbooks on topics of general interest in the field of
computational engineering sciences.
The books will focus on subjects in which numerical methods play a fundamental
role for solving problems in engineering and applied sciences. Advances in finite
element, finite volume, finite differences, discrete and particle methods and their
applications to classical single discipline fields and new multidisciplinary domains
are examples of the topics covered by the series.
The main intended audience is the first year graduate student. Some books define
the current state of a field to a highly specialised readership; others are accessible to
final year undergraduates, but essentially the emphasis is on accessibility and clarity.
The books will be also useful for practising engineers and scientists interested in
state of the art information on the theory and application of numerical methods.
Genki Yagawa · Atsuya Oishi

Computational Mechanics
with Deep Learning
An Introduction
Genki Yagawa
Professor Emeritus
University of Tokyo and Toyo University
Tokyo, Japan

Atsuya Oishi
Graduate School of Technology, Industrial and Social Sciences
Tokushima University
Tokushima, Japan

ISSN 1877-7341 ISSN 1877-735X (electronic)


Lecture Notes on Numerical Methods in Engineering and Sciences
ISBN 978-3-031-11846-3 ISBN 978-3-031-11847-0 (eBook)
https://doi.org/10.1007/978-3-031-11847-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

Computational Mechanics

It is well known that various physical, chemical, and mechanical phenomena in
nature and the behaviors of artificially created structures and devices are described
by partial differential equations. While partial differential equations are rarely solved
by analytical methods except those under idealized and special conditions, such
numerical methods as the finite element method (FEM), the finite difference method
(FDM), and the boundary element method (BEM) can approximately solve most
partial differential equations using a grid or elements that spatially subdivide the
object. Thus, the development of these numerical methods has been a central issue
in computational mechanics. In these methods, the fineness of the grid or elements is
directly related to the accuracy of the solution. Therefore, many research resources
have been devoted to solving larger simultaneous equations faster, and together with
the remarkable advances in computers, it has become possible to solve very large
problems that were considered to be unsolvable some decades ago.
Nowadays, it has become possible to analyze a variety of complex phenomena,
expanding the range of applications of computational mechanics.

Deep Learning

On the other hand, the advances of computers have brought about significant devel-
opments in machine learning, which aims at classifying and making decisions by the
process of finding inherent rules and trends in large amounts of data based on
algorithms rather than human impressions and intuition.
Feedforward neural networks are one of the most popular machine learning algo-
rithms. They have the ability to approximate arbitrary continuous functions and have
been applied to various fields since the development of the error back propagation
learning in 1986. Since the beginning of the 21st century, they have become able to


use many hidden layers, an approach called deep learning. Their areas of application have been
further expanded due to the performance improvement by using more hidden layers.

Computational Mechanics with Deep Learning

Although the development of computational mechanics including the FEM has made
it possible to analyze various complex phenomena, there still remain many problems
that are difficult to deal with. Specifically, numerical solution methods such as the
FEM are solution methods firmly based on mathematical equations (partial differential
equations), so they are useful when finding solutions to partial differential equations
based on given boundary and initial conditions. However, it is not the case when
estimating boundary and initial conditions from the solutions. In fact, the latter is
often encountered in the design phase of artifacts.
In addition, as deep learning and neural networks can discover mapping relations
between data without explicit mathematical formulas, it is possible to find inverse
mappings only by swapping the input and output. For this reason, deep learning and
neural networks have been accepted in the field of computational mechanics as an
important method to deal with the weak points of conventional numerical methods
such as the FEM.
They were mainly applied to such limited areas as the estimation of constitu-
tive laws of nonlinear materials and non-destructive evaluation, but with the recent
development of deep learning, their applicability has been expanded dramatically. In
other words, a fusion has started between deep learning and computational mechanics
beyond the conventional framework of computational mechanics.

Readership

The authors’ previous book titled Computational Mechanics with Neural Networks
published in 2021 by Springer covers most of the applications of neural networks
and deep learning in computational mechanics from its early days to the present
together with applications of other machine learning methods. Its concise descrip-
tions of individual applications make it suitable for researchers and engineers to get
an overview of this field.
On the other hand, the present book, Computational Mechanics with Deep
Learning: An Introduction, is intended to select carefully some recent applications of
deep learning and to discuss each application in detail, but in an easy-to-understand
manner. Sample programs are included for the readers to try out in practice. This
book is therefore useful not only for researchers and engineers, but also for a wide
range of readers who are interested in this field.

Structure of This Book

The present book is written from the standpoint of integrating computational
mechanics and deep learning, consisting of three parts: Part I (Chaps. 1–3) covers the
basics, Part II (Chaps. 4–8) covers several applications of deep learning to computa-
tional mechanics with detailed descriptions of the fields of computational mechanics
to which deep learning is applied, and Part III (Chaps. 9–10) describes programming,
where the program codes for both computational mechanics and deep learning are
discussed in detail. The authors have tried to make the program not a black box, but
a useful tool for readers to fully understand and handle the processing. The contents
of each chapter are summarized as follows:
Part I Fundamentals:
In Chap. 1, the importance of deep learning in computational mechanics is given
first and then the development process of deep learning is reviewed. In addition,
various new methods used in deep learning are introduced in an easy-to-understand
manner.
Chapter 2 is devoted to the mathematical aspects of deep learning. It discusses the
forward and backward propagations of typical network structures in deep learning,
such as fully connected feedforward neural networks and convolutional neural
networks, using mathematical formulas with examples, and also learning acceleration
and regularization methods.
Chapter 3 discusses the current research trends in this field based on articles
published in several journals. Many of these articles are compiled in the reference
list, which may be useful for further study.
Part II Case Study:
Chapter 4 presents an application of deep learning to the elemental integration
process of the finite element method. It is shown that a general-purpose numerical
integration method can be optimized for each integrand by deep learning to obtain
better results.
Chapter 5 introduces a method for improving the accuracy of the finite element
solutions by deep learning, showing how deep learning can break the common
knowledge that a fine mesh is essential to obtain an accurate solution.
Chapter 6 is devoted to an application of deep learning to the contact point search
process in contact analysis. It deals with contact between smooth contact surfaces
defined by NURBS and B-spline basis functions, showing how deep learning helps
to accelerate and stabilize the contact analysis.
Chapter 7 presents an application of deep learning to fluid dynamics. A convolu-
tional neural network is used to predict the flow field, showing its unparalleled speedy
calculation against that of conventional computational fluid dynamics (CFD).
Chapter 8 discusses further applications of deep learning to solid and fluid
analysis.
Part III Computational Procedures:
Chapter 9 describes some programs to be used for the application problems:
Sect. 9.1 programs in the field of computational mechanics, such as the element

stiffness matrix calculation program, and Sect. 9.2 those in the field of deep learning,
such as the feedforward neural network, both of which are given with background
mathematical formulas.
Chapter 10 presents programs for the application of deep learning to the elemental
integration discussed in Chap. 4. With these programs and those presented in Chap. 9,
the readers of the present book could easily try “Computational Mechanics with Deep
Learning” by themselves.

Tokyo, Japan Genki Yagawa


Tokushima, Japan Atsuya Oishi
May 2022
Acknowledgements

We would like to express our gratitude to Y. Tamura, M. Masuda, and Y. Nakabayashi
for providing the data for Chap. 7. We also express our cordial thanks to all the
colleagues and students who have collaborated with us over several decades in the
field of computational mechanics with neural networks/deep learning: S. Yoshimura,
M. Oshima, H. Okuda, T. Furukawa, N. Soneda, H. Kawai, R. Shioya, T. Horie, Y.
Kanto, Y. Wada, T. Miyamura, G. W. Ye, T. Yamada, A. Yoshioka, M. Shirazaki, H.
Matsubara, T. Fujisawa, H. Hishida, Y. Mochizuki, T. Kowalczyk, A. Matsuda, C.
R. Pyo, J. S. Lee, and K. Yamada.
We are particularly grateful to Prof. E. Oñate (CIMNE/Technical Univ. of
Catalonia, Spain) for his kind and important suggestions and encouragements during
the publication process of this book.

Tokyo, Japan Genki Yagawa


Tokushima, Japan Atsuya Oishi

Contents

Part I Fundamentals
1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 Deep Learning: New Way for Problems Unsolvable
by Conventional Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Progress of Deep Learning: From McCulloch–Pitts Model
to Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 New Techniques for Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.3.1 Numerical Precision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.3.2 Adversarial Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3.3 Dataset Augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.3.4 Dropout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.3.5 Batch Normalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.3.6 Generative Adversarial Networks . . . . . . . . . . . . . . . . . . . 31
1.3.7 Variational Autoencoder . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.3.8 Automatic Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . 39
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2 Mathematical Background for Deep Learning . . . . . . . . . . . . . . . . . . . 49
2.1 Feedforward Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.2 Convolutional Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.3 Training Acceleration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.3.1 Momentum Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.3.2 AdaGrad and RMSProp . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.3.3 Adam . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
2.4 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
2.4.1 What Is Regularization? . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
2.4.2 Weight Decay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
2.4.3 Physics-Informed Network . . . . . . . . . . . . . . . . . . . . . . . . . 72
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73


3 Computational Mechanics with Deep Learning . . . . . . . . . . . . . . . . . . 75


3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2 Recent Papers on Computational Mechanics with Deep
Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

Part II Case Study


4 Numerical Quadrature with Deep Learning . . . . . . . . . . . . . . . . . . . . . 95
4.1 Summary of Numerical Quadrature . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.1.1 Legendre Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.1.2 Lagrange Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.1.3 Formulation of Gauss–Legendre Quadrature . . . . . . . . . . 98
4.1.4 Improvement of Gauss–Legendre Quadrature . . . . . . . . . 101
4.2 Summary of Stiffness Matrix for Finite Element Method . . . . . . . 103
4.3 Accuracy Dependency of Stiffness Matrix on Numerical
Quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.4 Search for Optimal Quadrature Parameters . . . . . . . . . . . . . . . . . . . 114
4.5 Search for Optimal Number of Quadrature Points . . . . . . . . . . . . . 122
4.6 Deep Learning for Optimal Quadrature of Element
Stiffness Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
4.6.1 Estimation of Optimal Quadrature Parameters
by Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
4.6.2 Estimation of Optimal Number of Quadrature
Points by Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.7 Numerical Example A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.7.1 Data Preparation Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.7.2 Training Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.7.3 Application Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.8 Numerical Example B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.8.1 Data Preparation Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.8.2 Training Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
4.8.3 Application Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5 Improvement of Finite Element Solutions with Deep Learning . . . . . 139
5.1 Accuracy Versus Element Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.2 Computation Time versus Element Size . . . . . . . . . . . . . . . . . . . . . 141
5.3 Error Estimation of Finite Element Solutions . . . . . . . . . . . . . . . . . 148
5.3.1 Error Estimation Based on Smoothing of Stresses . . . . . 148
5.3.2 Error Estimation Using Solutions Obtained
by Various Meshes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.4 Improvement of Finite Element Solutions Using Error
Information and Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.5 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
5.5.1 Data Preparation Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

5.5.2 Training Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158


5.5.3 Application Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6 Contact Mechanics with Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.1 Basics of Contact Mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.2 NURBS Basis Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.3 NURBS Objects Based on NURBS Basis Functions . . . . . . . . . . . 180
6.4 Local Contact Search for Surface-to-Surface Contact . . . . . . . . . . 188
6.5 Local Contact Search with Deep Learning . . . . . . . . . . . . . . . . . . . 192
6.6 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
6.6.1 Data Preparation Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
6.6.2 Training Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
6.6.3 Application Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
7 Flow Simulation with Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
7.1 Equations for Flow Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
7.2 Finite Difference Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
7.3 Flow Simulation of Incompressible Fluid with Finite
Difference Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
7.3.1 Non-dimensional Navier–Stokes Equations . . . . . . . . . . . 218
7.3.2 Solution Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
7.3.3 Example: 2D Flow Simulation of Incompressible
Fluid Around a Circular Cylinder . . . . . . . . . . . . . . . . . . . 221
7.4 Flow Simulation with Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . 222
7.5 Neural Networks for Time-Dependent Data . . . . . . . . . . . . . . . . . . 225
7.5.1 Recurrent Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . 225
7.5.2 Long Short-Term Memory . . . . . . . . . . . . . . . . . . . . . . . . . 230
7.6 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
7.6.1 Data Preparation Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
7.6.2 Training Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
7.6.3 Application Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
8 Further Applications with Deep Learning . . . . . . . . . . . . . . . . . . . . . . . 241
8.1 Deep Learned Finite Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
8.1.1 Two-Dimensional Quadratic Quadrilateral
Element . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
8.1.2 Improvement of Accuracy of [B] Matrix Using
Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
8.2 FEA-Net . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
8.2.1 Finite Element Analysis (FEA) With Convolution . . . . . 253
8.2.2 FEA-Net Based on FEA-Convolution . . . . . . . . . . . . . . . . 259
8.2.3 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
8.3 DiscretizationNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262

8.3.1 DiscretizationNet Based on Conditional
Variational Autoencoder . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
8.3.2 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
8.4 Zooming Method for Finite Element Analysis . . . . . . . . . . . . . . . . 269
8.4.1 Zooming Method for FEA Using Neural Network . . . . . 269
8.4.2 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
8.5 Physics-Informed Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . 275
8.5.1 Application of Physics-Informed Neural Network
to Solid Mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
8.5.2 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281

Part III Computational Procedures


9 Bases for Computer Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
9.1 Computer Programming for Data Preparation Phase . . . . . . . . . . . 285
9.1.1 Element Stiffness Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
9.1.2 Mesh Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
9.1.3 B-Spline and NURBS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
9.2 Computer Programming for Training Phase . . . . . . . . . . . . . . . . . . 325
9.2.1 Sample Code for Feedforward Neural Networks
in C Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
9.2.2 Sample Code for Feedforward Neural Networks
in C with OpenBLAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
9.2.3 Sample Code for Feedforward Neural Networks
in Python Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
9.2.4 Sample Code for Convolutional Neural Networks
in Python Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
10 Computer Programming for a Representative Problem . . . . . . . . . . . 381
10.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
10.2 Data Preparation Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
10.2.1 Generation of Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
10.2.2 Calculation of Shape Parameters . . . . . . . . . . . . . . . . . . . . 385
10.2.3 Calculation of Optimal Numbers of Quadrature
Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
10.3 Training Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
10.4 Application Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
Part I
Fundamentals
Chapter 1
Overview

Abstract In this chapter, we provide an overview of deep learning. Firstly, in
Sect. 1.1, the differences between deep learning and conventional methods and also
the special role of deep learning are explained. Secondly, Sect. 1.2 shows a histor-
ical view of the development of deep learning. Finally, Sect. 1.3 gives various new
techniques used in deep learning.

1.1 Deep Learning: New Way for Problems Unsolvable
by Conventional Methods

Deep learning could be regarded as an advanced form of feedforward neural networks. Both of
them have been adopted in various fields of computational mechanics since their
emergence due to the reason that these techniques have a potential to compensate
for the weakness of conventional computational mechanics methods.
Let us consider a simple problem to know the role of deep learning in
computational mechanics as follows:

Problem 1 Assume a square plate, its bottom fixed at both ends, and
loaded partially at the top (Fig. 1.1a). Let us find the displacements
$(u_1, v_1), \ldots, (u_4, v_4)$ at the four points at the top (Fig. 1.1b).

The first solution method, which will be the simplest, is to actually apply a load
to the plate and measure the displacements, which may give the most reliable results
if it is easy to set up the experimental conditions and measure the physical quantity
of interest. This method can be called an experiment-based solution method.
The second method that comes to mind is to calculate the displacements by the
finite element analysis [32]. This problem can be solved by the two-dimensional
finite element stress analysis based on the following three kinds of equations.
Equations of balance of forces in an analysis region:

Fig. 1.1 Square plate under tensile force

$$\begin{cases} \dfrac{\partial \sigma_x}{\partial x} + \dfrac{\partial \tau_{xy}}{\partial y} = 0 \\[6pt] \dfrac{\partial \tau_{xy}}{\partial x} + \dfrac{\partial \sigma_y}{\partial y} = 0 \end{cases} \quad \text{in } \Omega \tag{1.1.1}$$

Equations of equilibrium at the load boundary:


$$\begin{cases} \sigma_x n_x + \tau_{xy} n_y = T_x \\ \tau_{xy} n_x + \sigma_y n_y = T_y \end{cases} \quad \text{on } \Gamma_\sigma \tag{1.1.2}$$

Equations at the displacement boundary:


$$\begin{cases} u = \bar{u} \\ v = \bar{v} \end{cases} \quad \text{on } \Gamma_u \tag{1.1.3}$$

Equation (1.1.1) is solved under the conditions Eqs. (1.1.2) and (1.1.3). Equa-
tion (1.1.2), which describes equilibrium at the load boundary, is called the Neumann
boundary condition, and Eq. (1.1.3), which describes the fixed displacements, the
Dirichlet boundary condition.
Based on the finite element method, Eqs. (1.1.1), (1.1.2) and (1.1.3) are formulated
as a set of linear equations as follows [72]:

$$[K]\{U\} = \{F\} \tag{1.1.4}$$

where $[K]$ in the left-hand side is called the coefficient matrix or the global stiffness
matrix, $\{U\}$ the vector of nodal displacements, and $\{F\}$ the right-hand side vector
calculated from the nodal equivalent load. The nodal displacements of all the nodes
in the domain to be solved are obtained by solving the simultaneous linear equations,
Eq. (1.1.4). For each of the four points specified in the problem, the displacements
of the point can be directly obtained as the nodal displacements if the point is a node,
or by interpolating the displacements of the surrounding nodes if it is not a node.
This solution method, which is based on the numerical solution of partial differential
equations, is called a computational method based on differential equations or, simply,
an equation-based numerical solution method.
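To make the equation-based step concrete, the following minimal sketch (not taken from the book's program listings in Part III) solves a small system of the form of Eq. (1.1.4) with NumPy; the 4 × 4 stiffness matrix and load vector are placeholder values for illustration, not the actual plate model of Fig. 1.1.

```python
import numpy as np

# Sketch of the equation-based solution step of Eq. (1.1.4):
# solve [K]{U} = {F} for the nodal displacements.
# The matrix and load vector below are placeholder values.
K = np.array([[ 4.0, -1.0, -1.0,  0.0],
              [-1.0,  4.0,  0.0, -1.0],
              [-1.0,  0.0,  4.0, -1.0],
              [ 0.0, -1.0, -1.0,  4.0]])
F = np.array([0.0, 0.0, 1.0, 1.0])

U = np.linalg.solve(K, F)   # nodal displacement vector {U}
print(U)
```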
Then, we consider the following problem.

Problem 2 Assume the same square plate, its bottom fixed at both ends, and
loaded at one side of the top as Problem 1 (Fig. 1.1a). But, as shown in Fig. 1.2,
there is a hole inside the plate. Find the displacements $(u_1, v_1), \ldots, (u_4, v_4)$ at
the four points at the top (Fig. 1.1b).

The experiment for this problem may be more difficult than for the previous case.
Especially, if the domain is not a plate but a cube with a void being embedded, it will
be very time consuming to prepare for the experiment.
On the other hand, the equation-based numerical solution method can solve
Problem 2 without any difficulty by using a mesh divided according to the given
shape. This versatility of the equation-based numerical solution methods such as
the finite element method is a great advantage over the experiment-based solution
methods.
Supported by this advantage, it has become possible for numerical methods to
deal with almost all kinds of applied mechanics problems. Nowadays, the methods
are taken as the first choice for solving various problems.
However, it is clear that even the equation-based numerical solution method is
not a panacea if we consider the following problem.

Problem 3 Assume a square plate, its bottom fixed at both ends, loaded at one
side of the top (Fig. 1.1a) and the displacements at the four points at the top
known as $(u_1, v_1), \ldots, (u_4, v_4)$. Find the shape and the position of an unknown
hole in the plate (Fig. 1.3).

Fig. 1.2 Square plates with embedded holes: a square hole, b triangular hole, c round hole

Fig. 1.3 Estimation of shape and location of an embedded hole

Apparently, neither the experiment-based nor the equation-based numerical solu-
tion methods can solve this problem. So, what is the difference between Problems 1
and 2, and Problem 3?
In the equation-based numerical solution method for Problems 1 and 2, the
governing equations for displacements are solved under a given load condition called
the Neumann condition and a fixation condition called the Dirichlet condition,
and additional boundary conditions, such as the shape and the position of the hole,
where the displacements of all the nodes in the domain are obtained as the solution.
This is equivalent to an approach to achieve a mapping relation as follows:
$$g : \left\{ \begin{array}{l} \text{Dirichlet boundary condition} \\ \text{Neumann boundary condition} \\ \text{Hole parameters} \end{array} \right\} \rightarrow \left\{ \begin{array}{c} u_1 \\ v_1 \\ \vdots \\ u_4 \\ v_4 \end{array} \right\} \tag{1.1.5}$$

Solving the direct problem with the equation-based numerical method amounts
to finding this kind of mapping.
On the other hand, the mapping relation to solve Problem 3 is expressed as
$$h : \left\{ \begin{array}{l} \text{Dirichlet boundary condition} \\ \text{Neumann boundary condition} \\ \left\{ \begin{array}{c} u_1 \\ v_1 \\ \vdots \\ u_4 \\ v_4 \end{array} \right\} \end{array} \right\} \rightarrow \text{Hole parameters} \tag{1.1.6}$$

Solving the inverse problem amounts to finding the above kind of mapping. In this case,
it is to find mapping from the displacements that would usually be results obtained
by solving the governing equations to the hole parameters (shape and position of the
hole) that are conditions usually considered as input to solve the governing equations
[41].
It is clear that the inverse problem is much more difficult to handle than the direct
problem, where the solution can be achieved directly through the routine operation
of solving equations. It is noted that an inverse problem such as Problem 3 is a type
of problem that we encounter often when we design an artifact, asking “How can we
satisfy this condition?” This means that solving inverse problems efficiently is one
of the most important issues for applied mechanics.
Now, omitting the parameters used in Eq. (1.1.5), we have
$$g : \text{HoleParams} \rightarrow \left\{ \begin{array}{c} u_1 \\ v_1 \\ \vdots \\ u_4 \\ v_4 \end{array} \right\} \quad \text{or} \quad \left\{ \begin{array}{c} u_1 \\ v_1 \\ \vdots \\ u_4 \\ v_4 \end{array} \right\} = g(\text{HoleParams}) \tag{1.1.7}$$

Similarly, Eq. (1.1.6) can be written in concise form as follows:


$$h : \left\{ \begin{array}{c} u_1 \\ v_1 \\ \vdots \\ u_4 \\ v_4 \end{array} \right\} \rightarrow \text{HoleParams} \quad \text{or} \quad \text{HoleParams} = h\left( \left\{ \begin{array}{c} u_1 \\ v_1 \\ \vdots \\ u_4 \\ v_4 \end{array} \right\} \right) \tag{1.1.8}$$

Employing repeatedly the equation-based method to find the displacements
$(u_1, v_1), \ldots, (u_4, v_4)$ of the four points on the top of the plate for various hole
parameters, we can get a lot of data pairs of the hole parameters, $\text{HoleParams}(i) = (p_1(i), p_2(i), \ldots, p_n(i))$, and the displacements calculated using the parameters,
$(u_1(i), v_1(i)), \ldots, (u_4(i), v_4(i))$, shown as

$$\begin{array}{l} \{\text{HoleParams}(1), ((u_1(1), v_1(1)), \ldots, (u_4(1), v_4(1)))\} \\ \{\text{HoleParams}(2), ((u_1(2), v_1(2)), \ldots, (u_4(2), v_4(2)))\} \\ \qquad \vdots \\ \{\text{HoleParams}(N), ((u_1(N), v_1(N)), \ldots, (u_4(N), v_4(N)))\} \end{array} \tag{1.1.9}$$

Now, let $H()$ be an arbitrary function (mapping) with
$(u_1(i), v_1(i)), \ldots, (u_4(i), v_4(i))$ as input and the approximate values of
$\text{HoleParams}(i)$ as output, which is written as

$$\left( p_1^H(i), p_2^H(i), \ldots, p_n^H(i) \right) = H(u_1(i), v_1(i), \ldots, u_4(i), v_4(i)) \tag{1.1.10}$$

Then, let us find the $H()$, among all the admissible candidates, that minimizes

$$L = \sum_{i=1}^{N} \sum_{j=1}^{n} \left( p_j(i) - p_j^H(i) \right)^2 \tag{1.1.11}$$

As $H()$, which minimizes $L$, corresponds to the map $h$ in Eq. (1.1.8), it is expected
that given the displacements $(u_1, v_1), \ldots, (u_4, v_4)$ as input, a set of values of the hole
parameters corresponding to the input data is estimated.

$$\left( p_1^H, p_2^H, \ldots, p_n^H \right) = H(u_1, v_1, \ldots, u_4, v_4) \tag{1.1.12}$$

This approach is considered to be a solution method that attempts to derive a
solution by utilizing a large number of data as shown in Eq. (1.1.9) and can be called
a computational method based on data or, simply, a data-based solution method, one
of the most powerful solution methods for inverse problems.
Here, we discuss how we find the mapping $H()$, which minimizes $L$. It is known
that feedforward neural networks [27] and deep learning [22], an extension of feed-
forward neural networks, are able to construct a mapping $H()$ from data pairs. Specif-
ically, $H()$, which corresponds to the mapping $h$ in Eq. (1.1.8), can be constructed
by the error back propagation learning using the data in Eq. (1.1.9) as training data,
$u_1(i), v_1(i), \ldots, u_4(i), v_4(i)$ as input data, and $p_1(i), p_2(i), \ldots, p_n(i)$ as teacher
signals (see Sect. 2.1).
As described above, feedforward neural networks and their advanced form, deep
learning, are powerful data-based solution methods that can deal with inverse
problems that are difficult for conventional computational mechanics methods.
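As an illustration of how such a mapping $H()$ can be constructed in practice, the following sketch trains a small one-hidden-layer feedforward network by gradient descent so as to minimize the loss $L$ of Eq. (1.1.11). It is a simplified stand-in for the error back propagation learning described above: the data pairs are random placeholders for the displacement–hole-parameter pairs of Eq. (1.1.9), and the network sizes (8 inputs, 16 hidden units, 3 hole parameters) are assumptions for illustration only.

```python
import numpy as np

# Toy construction of H() of Eq. (1.1.10) from data pairs like Eq. (1.1.9).
rng = np.random.default_rng(0)
N, n_in, n_hid, n_out = 200, 8, 16, 3

X = rng.normal(size=(N, n_in))     # placeholder (u1, v1, ..., u4, v4) samples
P = rng.normal(size=(N, n_out))    # placeholder hole parameters p_1, ..., p_n

W1 = rng.normal(scale=0.1, size=(n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(scale=0.1, size=(n_hid, n_out)); b2 = np.zeros(n_out)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

alpha = 0.1
for epoch in range(2000):
    # forward pass
    H1 = sigmoid(X @ W1 + b1)
    P_hat = H1 @ W2 + b2               # approximations p_j^H(i)
    err = P_hat - P                    # drives the loss L of Eq. (1.1.11)
    # backward pass: gradients of (1/2) L with respect to the weights
    dW2 = H1.T @ err / N
    db2 = err.mean(axis=0)
    dH1 = err @ W2.T * H1 * (1.0 - H1)
    dW1 = X.T @ dH1 / N
    db1 = dH1.mean(axis=0)
    # gradient-descent update
    W1 -= alpha * dW1; b1 -= alpha * db1
    W2 -= alpha * dW2; b2 -= alpha * db2

def H(u):
    """Estimate hole parameters from a vector of measured displacements."""
    return sigmoid(u @ W1 + b1) @ W2 + b2
```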
Finally, let us consider the following problem.

Problem 4 Assume a square plate with its bottom fixed at both ends, loaded
at one side of the top (Fig. 1.1a), and the displacements at the four points
on the top being measured as $(u_1, v_1), \ldots, (u_4, v_4)$. Then, find the shape and
the position of a hole in the domain that minimizes $(v_1 - v_4)^2 + (v_2 - v_3)^2$
(Fig. 1.4).

Fig. 1.4 Estimation of shape and location of an embedded hole by minimizing a function

The above problem, minimizing or maximizing some value, is one that we often
encounter in designing artifacts, where we must seek an optimal point.
A possible way to solve Problem 4 is to repeatedly use the equation-based
methods to calculate $(u_1(i), v_1(i)), \ldots, (u_4(i), v_4(i))$ and then $(v_1(i) - v_4(i))^2 + (v_2(i) - v_3(i))^2$ to be minimized for all possible $\text{HoleParams}(i)$, which is the so-
called "brute-force" method. However, the process of calculating the displacements
using the equation-based solution method, i.e., the analysis process using the finite
element method, is considered unpractical due to the enormous computational load.
On the other hand, the evolutionary computation algorithms, such as the genetic
algorithms [46], are often employed as they can reduce the number of calculation cycles
of the finite element analyses, resulting in high efficiency in solving optimization
problems. Specifically, the genetic algorithm is known to be efficient in finding an
optimal HoleParams(i) as the search area can be narrowed.
In addition, the data-based solution methods such as the feedforward neural
networks can also be used to dramatically reduce the huge computational load.
Based on the data in Eq. (1.1.9), let $G()$ be an arbitrary function (mapping) with
$\text{HoleParams}(i)$ as input and $(u_1(i), v_1(i)), \ldots, (u_4(i), v_4(i))$ as output. Thus, we
have

$$\left( u_1^G(i), v_1^G(i) \right), \ldots, \left( u_4^G(i), v_4^G(i) \right) = G(\text{HoleParams}(i)) \tag{1.1.13}$$

Then, among the broad range of admissible $G()$s, we find the $G()$ that minimizes
the following:

$$\sum_{i=1}^{N} \sum_{j=1}^{4} \left\{ \left( u_j(i) - u_j^G(i) \right)^2 + \left( v_j(i) - v_j^G(i) \right)^2 \right\} \rightarrow \min \tag{1.1.14}$$

Here, $G()$ outputs the displacements $(u_1, v_1), \ldots, (u_4, v_4)$ for the given
HoleParams, which is almost equivalent to the finite element analysis. In other words,
the finite element analysis, which is an equation-based solution method with high
computational load, can be replaced by a neural network with low computational
load, which is constructed by a data-based solution method. This kind of neural
network is often called a surrogate model of the original finite element analysis,
which is an example of the application of the data-based solution method to direct
problems.
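The following sketch illustrates how a surrogate $G()$ could replace the finite element analysis in a brute-force search for Problem 4. The function G, the number of hole parameters, and the sampling ranges are all assumptions for illustration; in practice G would be a network trained on the data of Eq. (1.1.9).

```python
import numpy as np

def objective(displacements):
    # (v1 - v4)^2 + (v2 - v3)^2 of Problem 4
    u1, v1, u2, v2, u3, v3, u4, v4 = displacements
    return (v1 - v4) ** 2 + (v2 - v3) ** 2

def search_best_hole(G, n_candidates=10000, n_params=3, seed=0):
    """Brute-force search over hole parameters using a cheap surrogate G."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(0.0, 1.0, size=(n_candidates, n_params))
    # Each surrogate evaluation is far cheaper than a finite element run,
    # so sweeping many candidates becomes practical.
    values = [objective(G(p)) for p in candidates]
    return candidates[int(np.argmin(values))]

# Usage with a dummy stand-in for the trained surrogate network:
best = search_best_hole(lambda p: np.repeat(p.sum(), 8))
print(best)
```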
As is well recognized, the equation-based numerical solution methods, such as
the finite element analysis, have been the major tool of conventional computational
mechanics, expanding their area of application to various problems. As a result, the
majority of mechanical phenomena can now be solved by the equation-based numer-
ical solution method, replacing the experiment-based solution method. However, they
are still insufficient for solving inverse and optimization problems, both of which are
important in many fields. In contrast, the data-based solution methods such as neural
networks and deep learning can tackle rather easily the above problems. In other
words, the data-based solution method will remedy the weakness of the equation-
based numerical solution method, becoming a powerful way for tackling inverse and
optimization problems.

1.2 Progress of Deep Learning: From McCulloch–Pitts
Model to Deep Learning

In this section, the development of deep learning and its predecessor, feedforward
neural networks, is studied.
First, let us review a feedforward neural network, the predecessor of deep learning,
which is a network consisting of layers of units with connections between units in
adjacent layers. A unit performs multiple-input, single-output nonlinear transforma-
tion, similarly to a biological neuron (Fig. 1.5). In a feedforward neural network with
n layers, the first layer is called the input layer, the second to (n − 1)th layers the
intermediate or hidden layers, and the nth layer the output layer. Figure 1.6 shows
the structure of a feedforward neural network. The signal input to the input layer
is sequentially passed through the hidden layers and becomes the output signal at
the output layer. Here, the input signal undergoes a nonlinear transformation in each
layer. A feedforward neural network is considered "deep" if it has five or more
nonlinear transformation layers [43].
A brief chronology of feedforward neural networks and deep learning is shown
as follows:
1943 McCulloch–Pitts model [45]
1958 Perceptron [56]
1967 Stochastic gradient descent [1]
1969 Perceptrons [47]
1980 Neocognitron [18]
1986 Back propagation algorithm [58]
1989 Universal approximator [19, 29]
1989 Convolutional neural network [42]
2006 Pretraining with restricted Boltzmann machine [28]
2006 Pretraining with autoencoders [4]
2012 AlexNet [40]
2016 AlphaGo [61]
2017 AlphaGo Zero [62]

Fig. 1.5 Unit

Fig. 1.6 Feedforward neural network
The McCulloch–Pitts model was proposed as a mathematical model of biological
neurons [45], where the inputs $I_1, \ldots, I_n$ are the outputs of different neurons, each input
is multiplied by the corresponding weight $w_1, \ldots, w_n$ and summed, and then the bias $\theta$ is
added to form the input $u$ of the activation function, as shown in Fig. 1.7. The neuron
outputs a single value $f(u)$ as the output value $O$ as follows:
Fig. 1.7 Mathematical model of a neuron

$$O = f(u) = f\left( \sum_{i=1}^{n} w_i I_i + \theta \right) \tag{1.2.1}$$

In this model, the output of the neuron is binary (0 or 1), and the Heaviside function
is used as the activation function as
$$O = f(u) = \begin{cases} 1 & (u \ge 0) \\ 0 & (u < 0) \end{cases} \tag{1.2.2}$$
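A direct transcription of Eqs. (1.2.1) and (1.2.2) in Python reads as follows; the weights and bias used in the example are illustrative values chosen so that the neuron realizes a logical AND.

```python
import numpy as np

# McCulloch-Pitts neuron with Heaviside activation, Eqs. (1.2.1)-(1.2.2).
def mcculloch_pitts(inputs, weights, theta):
    u = np.dot(weights, inputs) + theta
    return 1 if u >= 0 else 0

# Example: the neuron fires only when both inputs are 1 (logical AND).
print(mcculloch_pitts(np.array([1, 1]), np.array([1.0, 1.0]), -1.5))  # -> 1
print(mcculloch_pitts(np.array([1, 0]), np.array([1.0, 1.0]), -1.5))  # -> 0
```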

Later, the perceptron was introduced in 1958, demonstrating the ability of
supervised learning for pattern recognition [56]. Here, we discuss a two-class
$(C_1, C_2)$ classification problem for $d$-dimensional data $x_i = (x_{i1}, \ldots, x_{id})^T$ using
this model (Fig. 1.8). If we expand the dimension of the data by one and set
$x_i = (1, x_{i1}, \ldots, x_{id})^T$, and set $w = (w_0, w_1, \ldots, w_d)^T$ for the weights, then $u$
in Eq. (1.2.1) can be described as

$$u = w^T x_i = w_0 + w_1 x_{i1} + w_2 x_{i2} + \cdots + w_d x_{id} \tag{1.2.3}$$

Fig. 1.8 Perceptron



where $w_0$ corresponds to $\theta$ in Eq. (1.2.1). Then, the classification rule with the
perceptron is written as

$$\begin{cases} x_i \in C_1 & \left( f(w^T x_i) \ge 0 \right) \\ x_i \in C_2 & \left( f(w^T x_i) < 0 \right) \end{cases} \tag{1.2.4}$$

As learning in the perceptron model can be regarded as the process of learning
weights $w$ that enable correct classification, the correct weights can be found auto-
matically by repeating the iterative update of the values. When the weights in the
$k$-th step of the iterative updates are given as $w^{(k)}$, the learning rule of the perceptron
model leaves $w^{(k)}$ unchanged if the classification using $w^{(k)}$ is correct for a certain
input data $x_i$, and updates $w^{(k)}$ by $x_i$ if it is not the case, as

$$\begin{cases} w^{(k+1)} = w^{(k)} & \text{(for the case of correct classification)} \\ w^{(k+1)} = w^{(k)} - \alpha x_i & \left( f(w^{(k)T} x_i) \ge 0 \text{ for } x_i \in C_2 \right) \\ w^{(k+1)} = w^{(k)} + \alpha x_i & \left( f(w^{(k)T} x_i) < 0 \text{ for } x_i \in C_1 \right) \end{cases} \tag{1.2.5}$$

where $\alpha$ is a positive constant. For $x_i \in C_1$, we have

$$\begin{cases} w^{(k+1)} = w^{(k)} & \left( f(w^{(k)T} x_i) \ge 0 \right) \\ w^{(k+1)} = w^{(k)} + \alpha x_i & \left( f(w^{(k)T} x_i) < 0 \right) \end{cases} \tag{1.2.6}$$

If $f(w^{(k)T} x_i) < 0$ holds, we have

$$f(w^{(k+1)T} x_i) = f\left( (w^{(k)} + \alpha x_i)^T x_i \right) = f\left( w^{(k)T} x_i + \alpha |x_i|^2 \right) \tag{1.2.7}$$

Equation (1.2.7) suggests that the weights are updated so that the value
$f(w^{(k+1)T} x_i)$ approaches a positive value.
By iteratively applying this learning rule to all the input data, weights $w$ that can
correctly classify all the input data are determined. This learning rule was proven to
converge in a finite number of learning iterations, called the perceptron convergence
theorem [57]. The perceptron has attracted a great deal of attention, and the first
boom of neural networks occurred with it.
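A minimal sketch of the learning rule of Eq. (1.2.5) on a linearly separable toy data set is given below; the data, the learning coefficient, and the labeling convention (+1 for class C1, −1 for class C2) are assumptions for illustration.

```python
import numpy as np

# Perceptron learning rule of Eq. (1.2.5) on separable toy data.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
labels = np.where(X[:, 0] + X[:, 1] > 0.0, 1, -1)   # separable by a line
X_aug = np.hstack([np.ones((50, 1)), X])            # x_i = (1, x_i1, x_i2)

w = np.zeros(3)
alpha = 0.1
for epoch in range(100):
    n_errors = 0
    for x_i, label in zip(X_aug, labels):
        fired = 1 if w @ x_i >= 0.0 else -1
        if fired != label:                # misclassified: update w by x_i
            w += alpha * label * x_i      # +alpha*x for C1, -alpha*x for C2
            n_errors += 1
    if n_errors == 0:                     # guaranteed by the convergence theorem
        break
print(w)
```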
In 1969, however, the limitations of the perceptron were theoretically demon-
strated [47]: it was proven that a simple single-layer perceptron
could be applied only to linearly separable problems (Fig. 1.9), which cast doubt
on its applicability to practical classification problems. The hope for the perceptron
dropped drastically, and the first neural network boom calmed down.
The weakness of the perceptron, which was only effective for linearly separable
problems, was overcome by making it multilayered, which in turn created a demand
for a suitable learning algorithm.
Fig. 1.9 Linearly separable and inseparable data: a linearly separable, b linearly inseparable

In 1986, the back propagation algorithm was introduced as a new learning algo-
rithm for multilayer feedforward neural networks, as shown in Fig. 1.6 [58], which
is known as an algorithm based on the steepest descent method that modifies the
connection weights between units in the direction of decreasing the error, which is
defined as the square of the difference between the output from the output layer unit
and the corresponding teacher data as follows:

$$E = \frac{1}{2} \sum_{p=1}^{n_P} \sum_{j=1}^{n_L} \left( {}^p O_j^L - {}^p T_j \right)^2 \tag{1.2.8}$$

where
${}^p O_j^L$  the output of the $j$th unit in the $L$th layer (output layer) for the $p$th training pattern,
${}^p T_j$  the teacher signal corresponding to the output of the $j$th unit in the output layer for the $p$th training pattern,
$n_P$  the total number of training patterns,
$n_L$  the total number of output units.
Let $w_{ji}^{(k)}$ be the connection weight between the $i$th unit of the $k$th layer and the
$j$th unit of the $(k+1)$th layer in Fig. 1.6, then the back propagation algorithm
successively modifies $w_{ji}^{(k)}$ as follows:

$$w_{ji}^{(k)} \leftarrow w_{ji}^{(k)} - \alpha \frac{\partial E}{\partial w_{ji}^{(k)}} \tag{1.2.9}$$
Fig. 1.10 Sigmoid function (compared with the Heaviside function)

Here, α is a positive constant called the learning coefficient.


Since differentiation frequently appears in the back propagation, the sigmoid
function, which is nonlinear and continuously differentiable everywhere, has been used
as one of the most popular activation functions, given as

$$f(x) = \frac{1}{1 + e^{-x}} \tag{1.2.10}$$

Figure 1.10 shows the Heaviside function and the sigmoid function. It is seen that
the latter is a smoothed version of the former. (see Sect. 2.1 for details of the back
propagation algorithm.)
In 1989, it was shown that feedforward neural networks can approximate arbitrary
continuous functions [19, 29]. However, this theoretical proof is a kind of existence
theorem, and provides little answer to important practical questions such as how large
a neural network (number of layers, number of units in each layer, etc.) should be
used, what training parameters should be used, and how many training cycles are required for
convergence. Accordingly, determination of such meta-parameters is usually made
by trial and error.
With the advent of the back propagation algorithm in 1986, multilayer feedfor-
ward neural networks were put to practical use, and the application range of neural
networks was greatly expanded, resulting in the second neural network boom. It
should be noted that almost twenty years earlier than the advent of the back propaga-
tion algorithm, the prototype of the algorithm was proposed [1], but its importance
was not widely recognized at that time.
After a while, the second neural network boom that had started with the advent
of the back propagation algorithm gradually calmed down. This was due to the fact
that when the scale of a feedforward neural network was increased to improve its
function and performance, the learning process became too slow or often did not
proceed at all. There were two main reasons for this: one the speed of the computer
and the other the vanishing gradient problem.

Fig. 1.11 Development of supercomputers (performance in GFLOPS versus year, from Cray-1 to Fugaku)
Let us consider first the speed of computers. Figure 1.11 shows the history of the
fastest supercomputers, where the vertical axis is the computation speed, defined
by the number of floating-point operations per second (FLOPS: Floating-point
Operations Per Second). The unit used here is GigaFLOPS ($10^9$ FLOPS).
It is seen from the figure that, in 1986, when the back propagation algorithm was
introduced, the speed of supercomputers was about 2 GFLOPS; it was 220 GFLOPS
in 1996, 280 TFLOPS (TeraFLOPS: $10^{12}$ FLOPS) in 2006, and 415 PFLOPS
(PetaFLOPS: $10^{15}$ FLOPS) in 2021.
A simple calculation suggests that the training time of a feedforward neural
network, which takes only one minute on a current computer (415 PFLOPS), took
1482 min (about one day) on a computer (280 TFLOPS) in 2006, 1,886,364 min
(about three and a half years) on a computer (220 GFLOPS) in 1996, and
207,500,000 min (about 400 years) on a computer (2 GFLOPS) in 1986. In reality,
this calculation is not necessarily true because of the effects of parallel processing
and other factors, but it still shows the speed of progress in computing speed, in other
words the slowness of the computers of old time, suggesting that it was necessary to
wait for the progress of computers in order to apply the back propagation algorithm
to relatively large neural networks.
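The scaling argument above can be checked with a few lines of arithmetic; the script below simply divides the 2021 speed by the older speeds, ignoring parallel efficiency and other factors as noted in the text.

```python
# One minute of training on a 415 PFLOPS machine, scaled inversely with speed.
speeds_gflops = {
    2021: 415.0e6,   # 415 PFLOPS expressed in GFLOPS
    2006: 280.0e3,   # 280 TFLOPS
    1996: 220.0,     # 220 GFLOPS
    1986: 2.0,       #   2 GFLOPS
}
baseline = speeds_gflops[2021]
for year, speed in speeds_gflops.items():
    minutes = 1.0 * baseline / speed
    print(f"{year}: about {minutes:,.0f} minutes")
```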
As discussed above, the calculation speed of computers has been a big issue for
the back propagation algorithm. In addition to that, another barrier to the applica-
tion of large-scale multilayer feedforward neural networks to practical problems is
the vanishing gradient problem. This problem exists in multilayer neural networks,
where learning does not proceed in layers far away from the output layer, preventing
performance improvement by increasing the number of hidden layers. The cause of
the vanishing gradient is that the amount of correction by the back propagation algorithm,

$$\Delta w_{ji}^{(k)} = -\alpha \frac{\partial E}{\partial w_{ji}^{(k)}} \tag{1.2.11}$$

becomes small in the deeper layers (layers close to the input layer) due to the
small derivative of the sigmoid function that was most commonly employed as the
activation function. (see Sect. 2.1 for details.)
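A small numerical illustration of this effect: the derivative of the sigmoid function is at most 0.25, so an error signal passed backwards through many sigmoid layers shrinks roughly geometrically. The script below multiplies representative per-layer factors only; it is a toy model, not a full back propagation implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

u = 0.0                                   # the point where the sigmoid is most sensitive
deriv = sigmoid(u) * (1.0 - sigmoid(u))   # = 0.25, the maximum possible derivative
for n_layers in (2, 5, 10, 20):
    # Rough per-layer attenuation of the propagated error signal
    print(n_layers, "layers: gradient factor ~", deriv ** n_layers)
```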
Because of these issues, feedforward neural networks, while having the back
propagation learning algorithm and the versatility of being able to simulate arbitrary
nonlinear continuous functions, were “applied only to problems that a relatively
small network could handle.”
The serious situation described above changed in 2006, when methods to avoid the
vanishing gradient problem by layer-by-layer pretraining [4, 28] were proposed, which
made it possible to train multilayer feedforward neural networks free of the issue.
Here, we discuss how the autoencoder is used to pretrain multilayer feedforward
neural networks. The structure of autoencoder is shown in Fig. 1.12, which is a
feedforward neural network with one hidden layer, and the number of units in the
input layer is the same as the number of units in the output layer. The autoencoder
is trained to output the same data as the input data by the error back propagation
learning using the input data as the teacher data. After the training is completed,
the autoencoder simply outputs the input data, which seems to be a meaningless
operation, but in fact it corresponds to the conversion of the input data into a different
representation format in the hidden layer. For example, if the number of hidden layer
units is less than the number of input layer units, a compressed representation of the
input data is obtained.

Fig. 1.12 Autoencoder



In pretraining with autoencoders, the connection weights of the multilayer feed-
forward neural network are initialized by autoencoders. For the case of a five-layer
network shown in Fig. 1.13, three autoencoders are prepared; the autoencoder A
with the first layer of the original network as input layer and the second layer of
the original network as hidden layer, the autoencoder B with the second layer as
input layer and the third layer as hidden layer, and the autoencoder C with the third
layer as input layer and the fourth layer as hidden layer. Note that for each of these
autoencoders, the output layer has as many units as corresponding input layer. The
pretraining using the autoencoders above is performed as follows:

(1) First, the autoencoder A is trained using the input data of the original five-layer
neural network. After the training is completed, the connection weights between
the input and the hidden layers of the autoencoder A are set to the initial values
of the connection weights between the first and second layers of the original
five-layer neural network.
(2) Then, the autoencoder B is trained, where the output of the hidden layer of
the autoencoder A is used as the input data for training. After the training is
completed, the connection weights between the input and the hidden layers of
the autoencoder B are set to the initial values of the connection weights between
the second and third layers of the original five-layer neural network.

Fig. 1.13 Pretraining using autoencoders



(3) Third, the autoencoder C is trained, where the output of the hidden layer of
the autoencoder B is used as the input data for training. After the training is
completed, the connection weights between the input and the hidden layers of
the autoencoder C are set to the initial values of the connection weights between
the third and fourth layers of the original five-layer neural network.
(4) Finally, after initializing the connection weights between each layer with the
values obtained by the autoencoders in (1), (2) and (3) above, the error back
propagation learning of the five-layer feedforward neural network is performed
using the original input and the teacher data.
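A compact sketch of this greedy layer-by-layer pretraining, written in plain NumPy, is given below; it is a sketch, not the authors' implementation. The layer sizes, the training data, and the training parameters are placeholders; each autoencoder's decoder weights are discarded, and only its encoder weights are kept as initial values for the deep network, corresponding to steps (1)–(3) above.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder(X, n_hidden, epochs=500, alpha=0.1):
    """Train input -> hidden -> output (= input) and return encoder weights."""
    n_in = X.shape[1]
    We = rng.normal(scale=0.1, size=(n_in, n_hidden)); be = np.zeros(n_hidden)
    Wd = rng.normal(scale=0.1, size=(n_hidden, n_in)); bd = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ We + be)          # hidden representation
        X_hat = sigmoid(H @ Wd + bd)      # reconstruction of the input
        err = (X_hat - X) * X_hat * (1.0 - X_hat)
        dH = err @ Wd.T * H * (1.0 - H)
        Wd -= alpha * H.T @ err / len(X); bd -= alpha * err.mean(axis=0)
        We -= alpha * X.T @ dH / len(X);  be -= alpha * dH.mean(axis=0)
    return We, be

# Pretrain the three weight matrices of a five-layer network (cf. Fig. 1.13):
X0 = rng.uniform(size=(200, 16))          # placeholder input data
layer_sizes = [16, 12, 8, 4]              # placeholder sizes of layers 1-4

weights, inputs = [], X0
for n_hidden in layer_sizes[1:]:
    W, b = train_autoencoder(inputs, n_hidden)   # autoencoders A, B, C in turn
    weights.append((W, b))
    inputs = sigmoid(inputs @ W + b)      # hidden outputs feed the next one
# 'weights' now holds the initial values used in step (4), where the full
# five-layer network is trained by error back propagation.
```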

Thus, one can solve the vanishing gradient problem by setting the initial values of
the connection weights between layers starting from those closest to the input layer
with autoencoders.
Another factor, which made possible feedforward neural networks deeply multi-
layered, is the significant improvement in computer performance, suggesting that
learning can now be completed in a practical computing time. As a result, the restric-
tions on the construction of feedforward neural networks have been relaxed, and
the scale of the neural network can be increased according to the complexity of
the practical problem. In addition, in 2007, CUDA [39], a language for using graphics
processing units (GPUs) for numerical computation, was introduced; GPUs have
become widely used as accelerators for training and inference of feedforward neural
networks, further improving computer performance.
With the development of the pretraining method and the significant improve-
ment of computer performance above, the third neural network boom has started
with the emergence of a multilayer large-scale neural network, the so-called deep
learning. The necessity of pretraining is, however, decreasing due to improvements
of activation functions, training methods and computer performance.
The success of deep learning is also owing to the development of convolutional neural
networks, in which the units of each layer are arranged in a two-dimensional grid.
In other words, in a conventional feedforward neural network, all units in adjacent
layers are connected to each other, whereas in a convolutional neural network, a unit
in a layer is connected to only some units in the precedent layer. Figure 1.14 shows
the structure and function of a convolutional neural network. The input to the (k, l)th
p
unit in the pth layer, Ukl , is given using the outputs of units in the (p − 1)th layer
p−1
Oi, j as follows:

Σ T −1
S−1 Σ
p p−1 p−1 p
Uk,l = h s,t · Ok+s,l+t + θk,l (1.2.12)
s=0 t=0

p p−1
where θk,l is the bias of the (k, l)th unit in pth layer, h s,t , the weight at (s, t) in (p −
1)th layer, which, unlike the weights in a fully connected feedforward neural network,
is identical between units in the same layer, S and T are the range of contributions
to the input, and Fig. 1.14 shows the case of S = T = 3.

Fig. 1.14 Convolutional neural network

The weights $h_{s,t}$ can be expressed in matrix form as in Fig. 1.15 for the case of $S = T = 3$. The operation of Eq. (1.2.12) with $h_{s,t}$ is the same as the filter operation in image processing [21]. Figure 1.16 shows examples of filters used in image processing: Fig. 1.16a is the Laplacian mask used for image sharpening, and both Fig. 1.16b and c are for edge detection, where the direction of the edge to be detected is different between them. Note that the convolution operation represented by Eq. (1.2.12) is similar to feature extraction in image processing, and when the input data is an image, it can be interpreted as an operation to extract the features of the input image. For details on the calculation in the convolutional layer, see Sect. 2.2.
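A minimal sketch of Eq. (1.2.12), assuming NumPy; the 3 x 3 mask below is a common Laplacian-type mask used only for illustration (the exact masks of Fig. 1.16 are not reproduced here), and the input array and bias are illustrative assumptions.

import numpy as np

def conv_layer_input(O_prev, h, theta=0.0):
    """Compute U[k, l] = sum_{s,t} h[s, t] * O_prev[k+s, l+t] + theta (Eq. (1.2.12))."""
    S, T = h.shape
    K = O_prev.shape[0] - S + 1          # output height (valid convolution)
    L = O_prev.shape[1] - T + 1          # output width
    U = np.empty((K, L))
    for k in range(K):
        for l in range(L):
            U[k, l] = np.sum(h * O_prev[k:k + S, l:l + T]) + theta
    return U

h = np.array([[0, -1, 0],
              [-1, 4, -1],
              [0, -1, 0]], dtype=float)   # Laplacian-type mask (illustrative)
O_prev = np.arange(36, dtype=float).reshape(6, 6)   # outputs of the previous layer
print(conv_layer_input(O_prev, h))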
From a historical point of view, the introduction of locality such as convolutional layers into feedforward neural networks had already been done in the Neocognitron [18], which was inspired by the hierarchical structure of visual information processing [31]. Figure 1.17 shows the structure of the Neocognitron. The prototype of the current convolutional layer was proposed in 1989 [42]; convolutional layers are known to be very useful when images are employed as input. In addition to images, convolutional neural networks have become widely used for various multidimensional data such as voice or speech.

Fig. 1.15 Matrix of weights in convolutional neural network

Fig. 1.16 Examples of filtering masks

Fig. 1.17 Neocognitron. Reprinted from [18] with permission from Springer
The ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) [59] is an
image recognition contest using ImageNet, a large image dataset of over 10 million
images. In 2012, deep learning showed dominant performance in ILSVRC [40].
Since then, deep learning has been the best performer, and the winning systems of
the contest since 2012 are given as follows:
2012 AlexNet [40]
2013 ZFNet [69]
2014 GoogLeNet [66]
2015 ResNet [26]
2016 CUImage [70]
2017 SENet [30]

As the performance improvement was known to be achieved by adding more layers, AlexNet in 2012 above used five convolutional layers, GoogLeNet in 2014 more than 20 layers, and ResNet in 2015 more than 100 layers.
The effect of adding more convolutional layers has been verified in VGG [63],
which has a scalable structure and is considered a standard deep convolutional neural
network. It is shown that increasing the number of convolutional layers in VGG
improves the performance in classification.
In addition to image recognition, deep learning has been applied to game programming. AlphaGO, a Go program using deep learning, defeated the world champion [61]. AlphaGO has continued to evolve since then. While the original AlphaGO used actual game records played by human players as training data, AlphaGO Zero [62], an improved version of AlphaGO, adopted reinforcement learning, learning by playing against other AlphaGO Zeros, and achieved performance superior to that of its predecessor AlphaGO.
In addition to the above areas, deep learning has been applied to a wide range
of fields, including automatic driving such as traffic sign identification [10], natural
language processing such as machine translation [65] and language models GPT-3
[8] and BERT [16], and speech recognition [11, 54].
As deep learning has been applied to various fields, computational power has
been enhanced to deal with deep learning-specific processing. Large-scale and high-
performance computers based on CPUs and GPUs are used for the training process
of deep learning, which has made it possible to construct large-scale deep neural
networks. As a result, the amount of computation required for the inference process
using trained neural networks has increased rapidly. Usually, the inference process is performed under the user's control with computers much less powerful than those used for training; this is called edge computing, in which the inference runs on computers embedded in mobile devices such as smartphones, home appliances, and industrial machines. Since real-time performance is needed in this inference process, embedded GPUs such as Nvidia's Jetson are often used, and accelerators that specialize in accelerating the computation of deep learning inference are also being developed [6, 15].
Deep learning continues to develop, achieving remarkable results in various fields.

1.3 New Techniques for Deep Learning

In this section, some new and increasingly important techniques in deep learning
are discussed.

1.3.1 Numerical Precision

First, let us study the numerical accuracy required for deep learning.

It is well known that basic numbers employed in computers are binary, and there
are several formats for floating-point real numbers, among which we choose one
depending on the necessary precision level [67]. The floating-point real number is
represented by a series of binary digits, which consists of a sign part representing whether the number is positive or negative, an exponent part for the order, and a mantissa part for the significant digits, with the total length (number of bits) varying according to the precision of
the number. The IEEE754 standard specifies three types of real numbers, a double-
precision real number (FP64) with about 16 decimal significant digits, a single-
precision real number (FP32) with about 7 decimal digits, and a half-precision real
number (FP16) with about 3 decimal digits, which occupy 64, 32, and 16 bits of
memory, respectively. In the field of computational mechanics, FP64 is usually used,
and some problems even need quadruple-precision real numbers.
In contrast to the above, it has been shown that deep learning can achieve sufficient accuracy even when using real numbers with relatively low precision [12–14, 25]. Although training in deep learning usually requires higher numerical precision than inference with trained neural networks due to the calculation of derivatives, it has been shown that FP32 and FP16 are sufficient even for training.
For this reason, new low-precision floating-point real number formats are also
being used for deep learning, including BFloat16 (BF16) and Tensor Float 32 (TF32).
The former, proposed by Google, has more exponent bits than FP16, namely the same number of exponent bits as FP32. Since the number of bits in the mantissa part is reduced, the number of significant digits is also reduced, but the range of numbers that can be expressed is almost the same as that of FP32. The latter, proposed by Nvidia, has the same number of
digits in the exponent part as FP32, and the same number of digits in the mantissa
part as FP16. The TF32 format has 19 bits in total, meaning a special format whose
length is not a power of 2. The major floating-point number formats are summarized
in Fig. 1.18.

Fig. 1.18 Floating-point real number formats (sign: 1 bit in all formats; exponent/fraction bits: FP64 11/52, FP32 8/23, FP16 5/10, BF16 8/7, TF32 8/10)



Fig. 1.19 Pseudo-low-precision method

Since deep learning requires rather low numerical precision, it is possible to further improve the performance of deep learning by using integer [20] or fixed-point real number formats [25], or to implement dedicated arithmetic hardware for deep learning on a field programmable gate array (FPGA).
In practice, it is difficult to know the numerical precision required for each
problem. However, a rough estimate can be made using the pseudo-low-precision
(PLP) method [50]. For example, a single-precision real number (FP32) has a
mantissa part of 23 bits out of a total length of 32 bits, but if we shift the entire
32 bits to the right by n bits and then to the left by n bits, the last n digits of the
mantissa part are filled with zeros, and the number of digits in the mantissa part is
reduced to (23 − n) bits. Figure 1.19 shows an example of PLP with 8-bit shifting,
and List 1.3.1 the sample code to verify the operation of PLP, where the int type
is assumed to be 32 bits. List 1.3.2 shows an example of the execution using gcc
4.8.5 on CentOS 7.9. Note that all the arithmetic operations above are performed as
single-precision real numbers (FP32) and the results are stored as single-precision
real numbers (23-bit mantissa), so it is necessary to reduce the precision by PLP
again immediately after each arithmetic operation.
List 1.3.1 PLP test code

#include <stdio.h>

/* Union to access the bit pattern of a float as a 32-bit integer */
typedef union{
  float f;
  int i;
}u_fi32;

int main(void){
  int nsh;
  float f1,f2;
  u_fi32 d1,d2,d3;
  f1 = 2.718281828;
  f2 = 3.141592653;
  for(nsh=1;nsh<23;nsh++){
    /* Truncate the last nsh bits of the mantissa of f1 */
    d1.f = f1;
    d1.i = d1.i >> nsh;
    d1.i = d1.i << nsh;
    /* Truncate the last nsh bits of the mantissa of f2 */
    d2.f = f2;
    d2.i = d2.i >> nsh;
    d2.i = d2.i << nsh;
    /* Multiply and truncate the result again immediately after the operation */
    d3.f = d1.f*d2.f;
    d3.i = d3.i >> nsh;
    d3.i = d3.i << nsh;
    printf("%2d %f %f %f\n",nsh,d1.f,d2.f,d3.f);
  }
  return 0;
}

List 1.3.2 Results of the PLP test code (CentOS 7.9, gcc 4.8.5)
1 2.718282 3.141593 8.539734
2 2.718282 3.141592 8.539730
3 2.718281 3.141592 8.539726
4 2.718281 3.141590 8.539719
5 2.718277 3.141586 8.539673
6 2.718277 3.141586 8.539673
7 2.718262 3.141571 8.539551
8 2.718262 3.141541 8.539307
9 2.718262 3.141479 8.539062
10 2.718262 3.141357 8.538086
11 2.718262 3.141113 8.537109
12 2.717773 3.140625 8.535156
13 2.716797 3.140625 8.531250
14 2.714844 3.140625 8.515625
15 2.710938 3.140625 8.500000
16 2.703125 3.140625 8.437500
17 2.687500 3.125000 8.375000
18 2.687500 3.125000 8.250000
19 2.625000 3.125000 8.000000
20 2.500000 3.000000 7.500000
21 2.500000 3.000000 7.000000
22 2.000000 3.000000 6.000000

1.3.2 Adversarial Examples

Deep learning has shown very good performance in image recognition and is said to
surpass human ability of discrimination in some areas. However, it has been reported
that deep learning can misidentify images that can be easily identified by humans.
Goodfellow et al. [23], employing an image that should be judged to be a panda
on which a small noise is superimposed, show that the superimposed image looks
almost identical to the original image to the human eye or can be easily identified
as a panda, whereas the convolutional neural network GoogLeNet [66] judges it as
a gibbon. This kind of input data is called an adversarial example.

The mechanism by which an adversarial example occurs in a neural network can be explained as follows. Let the input to the neural network be $x = (x_1, \ldots, x_n)^T$ and the weights of the $j$th unit of the next layer $w_j = (w_{j1}, \ldots, w_{jn})^T$; then the input to that unit can be written as

$u_j = \sum_i w_{ji} x_i = w_j^T x$   (1.3.1)

When a small noise $\Delta x$ is added to the input, the variation of $u_j$, $\Delta u_j$, is given by

$\Delta u_j = w_j^T \Delta x$   (1.3.2)

Equation (1.3.2) shows that $\Delta u_j$ is the inner product of $w_j$ and $\Delta x$, and therefore the variation $\Delta u_j$ takes its maximum value when $\Delta x = k w_j$, showing that among various noises of similar magnitude, the variation of the input to a unit, and also the output of the unit, becomes the largest for a noise vector $\Delta x$ with the specific direction, i.e., parallel to $w_j$.
For a well-trained multilayer feedforward neural network, we can also make the output fluctuate greatly with small fluctuations of the input as follows. When the error function of the neural network is represented as $E = E(x)$, the input noise vector as $\Delta x = (\Delta x_1, \ldots, \Delta x_n)^T$, and a small positive constant as $\varepsilon$, then adding the noise vector $\Delta x$ generated by

$\Delta x_i = \varepsilon \cdot \mathrm{sgn}\!\left( \dfrac{\partial E}{\partial x_i} \right)$   (1.3.3)

to the input vector produces a significant difference between the output of the neural network and the teacher data. Here, $\mathrm{sgn}(x)$ is defined as follows:

$\mathrm{sgn}(x) = \begin{cases} 1 & (x \ge 0) \\ -1 & (x < 0) \end{cases}$   (1.3.4)

Adversarial examples being created as above, we discuss here how to make a feedforward neural network that can identify an adversarial example as correctly as possible. One method is to use the adversarial example as a regularization term [27]. Though the usual back propagation algorithm minimizes $E(x)$, this method adds $E(x + \Delta x_{adv})$ as a regularization term and minimizes

$\alpha \cdot E(x) + (1 - \alpha) \cdot E(x + \Delta x_{adv})$   (1.3.5)

where $\alpha$ is a positive constant and $\Delta x_{adv}$ a noise vector generated by Eq. (1.3.3).
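A minimal sketch of generating the noise vector of Eq. (1.3.3), assuming NumPy and a toy linear model $y = w \cdot x$ with squared error $E = (y - t)^2/2$, so that $\partial E / \partial x = (y - t)\,w$; the model, $\varepsilon$, and the data are illustrative assumptions, not the networks discussed in the text.

import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=8)          # fixed "trained" weights (assumed)
x = rng.normal(size=8)          # original input
t = 0.0                         # teacher signal (assumed)
eps = 0.1                       # small positive constant of Eq. (1.3.3)

y = w @ x                       # toy network output
grad_x = (y - t) * w            # dE/dx for the squared error above
dx = eps * np.sign(grad_x)      # Eq. (1.3.3); note np.sign(0) = 0, while sgn(0) = 1 in Eq. (1.3.4)
x_adv = x + dx                  # adversarial example

print("E before:", 0.5 * (w @ x - t) ** 2)
print("E after :", 0.5 * (w @ x_adv - t) ** 2)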

1.3.3 Dataset Augmentation

In neural networks and deep learning, the number of training patterns is one of the
most important issues. If it is small, overtraining [27] is likely to occur. To avoid
this, it is necessary to have as many training patterns as possible. In many situations,
however, it is not easy to collect a sufficient number of training patterns as seen in
the case of medical imaging (X-ray, MRI, etc.).
For this reason, the original training patterns (images) are processed to make
new training patterns, called dataset augmentation. As an example, in the deep
learning library Keras, its ImageDataGenerator function provides images that have
been processed in various ways, such as rotation, translation, inversion, and shear
deformation, to increase the number of training patterns. Table 1.1 shows the main
parameters of ImageDataGenerator and their effects.
Even when the input data are audio data, the data augmentation described above
is performed and proved to be effective [34, 53].
Data augmentation for images as above is considered to be difficult in such cases as a fully connected feedforward neural network. Then, another data augmentation method, the superimposition of noise, has been studied [60], which is based on

$x_i^{input} = (1 + r_i)\, x_i^{original}, \quad r_i \in [-\varepsilon, \varepsilon]$   (1.3.6)

where $\varepsilon$ is a small positive constant. During training, a small noise is superimposed on each component $x_i^{original}$ of the original input data $x^{original}$ by using a random number generated each time, and the result is used as the component $x_i^{input}$ of the input data $x^{input}$. Note that superimposing noise on the input data is reported to be effective in preventing overtraining in various applications [17, 51, 52].
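A minimal sketch of Eq. (1.3.6), assuming NumPy; the value of ε and the sample input are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def augment_with_noise(x_original, eps=0.01):
    """Return (1 + r_i) * x_i with r_i drawn uniformly from [-eps, eps] (Eq. (1.3.6))."""
    r = rng.uniform(-eps, eps, size=x_original.shape)
    return (1.0 + r) * x_original

x = np.array([0.5, 1.2, -0.3, 2.0])   # original training pattern (assumed)
x_input = augment_with_noise(x)        # a new pattern generated each time it is used
print(x_input)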

Table 1.1 Image data augmentation in Keras

Parameter of ImageDataGenerator   Functionality
rotation_range                    Rotation
width_shift_range                 Horizontal translation
height_shift_range                Vertical translation
shear_range                       Shear transform
zoom_range                        Zooming
horizontal_flip                   Horizontal flip
vertical_flip                     Vertical flip
fill_mode                         Points outside of the generated image are filled according to the given mode
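A hedged sketch of how the parameters of Table 1.1 might be passed to Keras' ImageDataGenerator is given below; the numerical values, the dummy image array, and the batch size are illustrative assumptions, not recommendations from the text.

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=15,        # rotation (degrees)
    width_shift_range=0.1,    # horizontal translation (fraction of width)
    height_shift_range=0.1,   # vertical translation (fraction of height)
    shear_range=10.0,         # shear transform
    zoom_range=0.1,           # zooming
    horizontal_flip=True,     # horizontal flip
    vertical_flip=False,      # vertical flip
    fill_mode="nearest",      # how points outside the generated image are filled
)

images = np.random.rand(8, 32, 32, 3)        # dummy image batch (N, H, W, C)
labels = np.arange(8)
batches = datagen.flow(images, labels, batch_size=4)
augmented_x, augmented_y = next(batches)     # one batch of augmented images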

1.3.4 Dropout

A method to improve the accuracy of classification by constructing multiple neural networks and averaging them or taking a majority vote over them has been studied, called ensemble learning or model averaging.
Suppose a highly idealized situation, where we have $2N + 1$ neural networks, each of which is different from the others, and all of them have the same accuracy of classification $p$ $\left( p > \tfrac{1}{2} \right)$. In this case, if the classification results of each neural network are independent, then the following relations hold for the accuracy of classification $P_{2N+1}$ by majority vote of the $2N + 1$ neural networks:

$P_{2N+1} < P_{2(N+1)+1}, \quad \lim_{N \to \infty} P_{2N+1} = 1$   (1.3.7)

This is referred to as Condorcet's theorem. In practice, the assumption that the classification results of each classifier (e.g., neural network) are independent is unreasonable, and it is often the case that many classifiers make the same misclassification for a given input data.
To relax this, for example, in the random forests [7, 49], where ensemble learning
is introduced to decision trees and classification is done by majority voting of a large
number of decision trees, the structure of the training patterns of individual decision
trees is changed, and the parameters for discrimination are also changed among trees.
This, nevertheless, still does not result in the construction of a fully independent set of classifiers. Even so, the method of preparing multiple classifiers is reported to be effective in improving the accuracy of classification in many cases.
Dropout [64] is equivalent to averaging multiple neural networks while using a single neural network. Figure 1.20 shows the schematic diagram of the dropout. Figure 1.20a shows the original four-layer feedforward neural network. During training, a uniform random number $rnd$ in the range $[0, 1]$ is generated for each unit in each epoch, and if $rnd > r$ for a predetermined dropout rate $r$ $(0 < r < 1)$, the output of the unit is fixed to 0. This is equivalent to using a feedforward neural network with a different structure for each epoch (Fig. 1.20b). After the training is completed, the neural network with the original structure using all units is employed for inference, but the output of each unit is multiplied by the dropout rate $r$ (Fig. 1.20c). Dropout is almost equivalent to taking the average value of many neural networks with different structures and is considered to suppress overtraining and improve the accuracy of estimation.
DropConnect [68] has also been proposed, which drops individual connections between units, whereas dropout drops individual units.
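A minimal sketch of the dropout operation described above, assuming NumPy; the dropout rate r, the array sizes, and the function names are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def dropout_train(h, r=0.8):
    """Training: zero the output of units whose random draw rnd exceeds r."""
    rnd = rng.uniform(0.0, 1.0, size=h.shape)
    mask = (rnd <= r).astype(h.dtype)
    return h * mask

def dropout_infer(h, r=0.8):
    """Inference: use all units but multiply their outputs by r."""
    return h * r

h = rng.normal(size=5)          # outputs of a hidden layer (illustrative)
print(dropout_train(h))
print(dropout_infer(h))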

Fig. 1.20 Dropout

1.3.5 Batch Normalization

For data pairs that are to be used as input and teacher data for neural networks to learn a mapping, it is common to transform the data into a certain range of values, or to process the data to align the mean and variance. Batch normalization is a technique that dynamically performs such transformations in the hidden layers as well.
As mentioned above, transformation operations on the input data of a neural network are usually performed. Let the number of input data be $n$, and the $p$th input data $x^p = \left( x_1^p, x_2^p, \ldots, x_d^p \right)$. The maximum value, minimum value, mean, and standard deviation of each component are, respectively, calculated as follows:

$x_i^{max} = \max_p \left\{ x_i^p \right\}$   (1.3.8)

$x_i^{min} = \min_p \left\{ x_i^p \right\}$   (1.3.9)

$\mu_i = \dfrac{1}{n} \sum_{p=1}^{n} x_i^p$   (1.3.10)

$\sigma_i = \sqrt{ \dfrac{1}{n} \sum_{p=1}^{n} \left( x_i^p - \mu_i \right)^2 }$   (1.3.11)

The most commonly employed transformation operations are the 0–1 transformation and the standardization. Assuming that the input data to the neural network after transformation are $\tilde{x}^p = \left( \tilde{x}_1^p, \tilde{x}_2^p, \ldots, \tilde{x}_d^p \right)$, the 0–1 transformation of the input data is given by

$\tilde{x}_i^p = \dfrac{x_i^p - x_i^{min}}{x_i^{max} - x_i^{min}}$   (1.3.12)

Similarly, the standardization of the input data is given as

$\tilde{x}_i^p = \dfrac{x_i^p - \mu_i}{\sigma_i}$   (1.3.13)

The above transformations can mitigate the negative effects of large differences in numerical range between individual parameters.
The batch normalization [33] performs the same operations on the inputs of each layer as on the input data. When the input value of the $i$th unit of the $l$th layer for the $p$th learning pattern is $x_{l,i}^p$ and the output is $y_{l,i}^p$, the input–output relationship is expressed by

$y_{l,i}^p = f\!\left( x_{l,i}^p \right) = f\!\left( \sum_j w_{i,j}^l \, y_{l-1,j} + \theta_{l,i} \right)$   (1.3.14)

where $\theta_{l,i}$ is the bias of the $i$th unit in the $l$th layer, and $w_{i,j}^l$ the connection weight between the $i$th unit in the $l$th layer and the $j$th unit in the $(l-1)$th layer.
The batch normalization is used to standardize the input values of each unit over a mini-batch of size $m$. That is, if the input values of the unit for the training patterns in a mini-batch are $\left\{ x_{l,i}^{k+1}, x_{l,i}^{k+2}, \ldots, x_{l,i}^{k+m-1}, x_{l,i}^{k+m} \right\}$, then we employ as input values the transformations $\left\{ \tilde{x}_{l,i}^{k+1}, \tilde{x}_{l,i}^{k+2}, \ldots, \tilde{x}_{l,i}^{k+m-1}, \tilde{x}_{l,i}^{k+m} \right\}$ given as

$\tilde{x}_{l,i}^p = \gamma \, \dfrac{x_{l,i}^p - \mu_{l,i}}{\sigma_{l,i}} + \beta, \quad (k+1 \le p \le k+m)$   (1.3.15)

Here, both $\gamma$ and $\beta$ are parameters that are updated by learning, and $\mu_{l,i}$ and $\sigma_{l,i}$ are, respectively, calculated by

$\mu_{l,i} = \dfrac{1}{m} \sum_{p=k+1}^{k+m} x_{l,i}^p$   (1.3.16)

$\sigma_{l,i} = \sqrt{ \dfrac{1}{m} \sum_{p=k+1}^{k+m} \left( x_{l,i}^p - \mu_{l,i} \right)^2 + \epsilon }$   (1.3.17)

where $\epsilon$ is a small constant that prevents division by zero. It is noted that batch normalization has been shown to be effective in improving learning speed in many cases and has become widely employed.
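A minimal sketch of Eqs. (1.3.15)–(1.3.17) applied to the inputs of one layer over a mini-batch, assuming NumPy; γ, β, the mini-batch size, and the data are illustrative assumptions.

import numpy as np

def batch_norm(x_batch, gamma, beta, eps=1e-5):
    """x_batch: (m, n_units) inputs of one layer for a mini-batch of size m."""
    mu = x_batch.mean(axis=0)                        # Eq. (1.3.16), per unit
    sigma = np.sqrt(x_batch.var(axis=0) + eps)       # Eq. (1.3.17), per unit
    return gamma * (x_batch - mu) / sigma + beta     # Eq. (1.3.15)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(16, 4))     # mini-batch of 16 patterns, 4 units
gamma = np.ones(4)                                   # learnable scale
beta = np.zeros(4)                                   # learnable shift
x_tilde = batch_norm(x, gamma, beta)
print(x_tilde.mean(axis=0), x_tilde.std(axis=0))     # approximately 0 and 1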

1.3.6 Generative Adversarial Networks

Generative adversarial networks (GANs) [24], one of the most innovative techniques
developed for deep learning, consist of two neural networks: the generator and the
discriminator.
The former is a neural network that generates data satisfying certain conditions, and the latter one that judges whether the input data is true data or not. The generator
takes arbitrary data (for example, arbitrary data generated by random numbers) as
input and is trained to output data that satisfies certain conditions. The discriminator
is trained to correctly discriminate between data output by the generator (called fake
data) and data prepared in advance that truly satisfies certain conditions (called real
data). In the early stages of learning, the fake data output by the generator can be
easily detected as “fake” by the discriminator, but as the training of the generator
progresses, it may output fake data that cannot be detected even by the discriminator.
The goal of GAN is to build a generator that outputs realistic data satisfying certain conditions, that is, data that cannot be detected as "fake data" by the discriminator. The
training process of GAN can be summarized as follows:
(1) Two neural networks, the generator and the discriminator, are prepared as shown
in Fig. 1.21. The number of output units of the generator should be the same as
the number of input units of the discriminator, and the number of output units
of the discriminator is 1 because the discriminator determines whether the input
data is real or fake only.

Fig. 1.21 Generator and discriminator in GAN



(2) Prepare a large number of true data that satisfies certain conditions, called real
data.
(3) The generator takes a lot of arbitrary data generated by random numbers as
input to collect a lot of output data of the generator called fake data (Fig. 1.22).
(4) Training of the discriminator is performed using the real data prepared in (2) and
the fake data collected in (3) as input data. As shown in Fig. 1.23, the teacher
data should be real (e.g., 1) for real data and fake (e.g., 0) for fake data. In this
way, the discriminator is trained to correctly discriminate between real and fake
data.
(5) After the training of the discriminator, the generator is trained by connecting
the generator and the discriminator in series, as shown in Fig. 1.24. The input

Fig. 1.22 Generation of fake data in GAN

Fig. 1.23 Training of discriminator in GAN

Fig. 1.24 Training of generator in GAN

data to the connected network are those of the generator, and the teacher data is
“real.” The back propagation algorithm is used to train the connected network,
where all the parameters (e.g., connection weights) of the discriminator part are
fixed, and only the parameters of the generator part are updated. In this way, the
generator is trained to output data that is judged to be real by the discriminator.
(6) Return to (3) after the training of the generator is completed.
Repeating the training of the discriminator and the generator alternately, the
trained generator finally becomes able to output data that is indistinguishable from
the real data by the discriminator. The GAN can use convolutional neural networks
for the generator and the discriminator.
The training process of GAN is written as a min–max problem as follows [24]:

$\min_G \max_D V(D, G) = E_{x \sim p_d(x)}\!\left[ \log D(x) \right] + E_{z \sim p_z(z)}\!\left[ \log\left( 1 - D(G(z)) \right) \right]$   (1.3.18)

where $V(D, G)$ is the objective function, $D$ and $G$ the discriminator and the generator, respectively, $D(x)$ the output of the discriminator for input data $x$, $G(z)$ the output of the generator for input data $z$, $p_d(x)$ the probability distribution of $x$, and $p_z(z)$ the probability distribution of $z$. The training process (4) above corresponds to the max operation on the left-hand side of Eq. (1.3.18), that is, the maximization of the right-hand side by updating the discriminator, while the training process (5) above corresponds to the min operation on the left-hand side of Eq. (1.3.18), meaning the minimization of the second term on the right-hand side by updating the generator.
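As a concrete illustration of Eq. (1.3.18), the following sketch estimates V(D, G) by replacing the expectations with sample means; the toy discriminator D, generator G, and distributions are illustrative assumptions (NumPy is assumed), not the GAN training procedure itself.

import numpy as np

rng = np.random.default_rng(0)

def D(x):                       # toy discriminator: probability that x is real
    return 1.0 / (1.0 + np.exp(-(2.0 * x - 1.0)))

def G(z):                       # toy generator: maps noise to fake data
    return 0.5 * z + 0.2

x_real = rng.normal(1.0, 0.1, size=1000)     # samples from p_d(x) (assumed)
z = rng.normal(0.0, 1.0, size=1000)          # samples from p_z(z) (assumed)

# Monte-Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]
V = np.mean(np.log(D(x_real))) + np.mean(np.log(1.0 - D(G(z))))
print(V)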

GANs based on convolutional neural networks are used for various problems including the generation of images. For example, they can generate an image of a dog from a noisy image with random numbers as input. However, it is known to be difficult to control what kind of dog image the generator creates.
Conditional generative adversarial network (CGAN) [48] is a modified version
of GAN that can control the images generated by the generator, where both the
generator and the discriminator accept the same input data as in GAN as well as data
about the attributes of the data called label data. The training process of CGAN is
summarized as follows:
(1) Two neural networks, the generator and the discriminator, are prepared as shown
in Fig. 1.25. Both the generator and the discriminator use information about the
attributes of the data (Label data) as input data in addition to the standard GAN
input data. Thus, the number of input units of the discriminator is the sum of
two numbers: one is the number of output units of the generator and the other
the number of label data. The number of output units of the discriminator is 1
because the discriminator determines whether the input data is real or fake only.
(2) Prepare a large number of true data that satisfies certain conditions, called real
data, which are accompanied by label data indicating their attributes.
(3) The generator takes a lot of arbitrary data generated by random numbers and
their attribute information (label data) as input to collect a lot of output data
(fake data) (Fig. 1.26).
(4) Training of the discriminator is performed using the real data prepared in (2)
and the fake data collected in (3) as input data. The corresponding label data
are also used as input. As shown in Fig. 1.27, the teacher data are set real (e.g.,
1) for real data and fake (e.g., 0) for fake data, respectively. In this way, the
discriminator is trained to correctly discriminate between real and fake data.
(5) After the discriminator is trained, the training of the generator is performed by
connecting the generator and the discriminator in series, as shown in Fig. 1.28.

Fig. 1.25 Generator and discriminator in CGAN



Fig. 1.26 Generation of fake data in CGAN

Fig. 1.27 Training of discriminator in CGAN

The input data to the connected network are the input data of the generator
and its label data with the teacher data being “real.” The back propagation
algorithm is used to train the connected network, where all the parameters (e.g.,
connection weights) of the discriminator part are fixed, and only the parameters
of the generator part are updated. In this way, the generator is trained to output
data judged to be real by the discriminator.
(6) Return to (3) after the training of the generator is completed.

Fig. 1.28 Training of generator in CGAN

The learning process of CGAN is also formulated as a min–max problem as follows [48]:

$\min_G \max_D V(D, G) = E_{x \sim p_d(x)}\!\left[ \log D(x \mid y) \right] + E_{z \sim p_z(z)}\!\left[ \log\left( 1 - D(G(z \mid y)) \right) \right]$   (1.3.19)

where $y$ is the attribute information (label data). Equation (1.3.19) can be regarded as the modified version of Eq. (1.3.18) conditioned with respect to $y$.
GANs have attracted much attention, especially for their effectiveness in image
generation and speech synthesis, and various improved GANs have been proposed.
Research on them is still active: for example, DCGAN [55] using a convolutional
neural network, InfoGAN [9] with an improved loss function, LSGAN [44] with a loss
function based on the least square error, CycleGAN [71] with doubled generators and
discriminators, WGAN [2] with a loss function based on the Wasserstein distance,
ProgressiveGAN [35] with hierarchically high resolution, and StyleGAN [36] with
an improved generator.

1.3.7 Variational Autoencoder

It is known that the variational autoencoder performs a function similar to that of the generative adversarial network (GAN) described in Sect. 1.3.6.

Figure 1.29 shows the basic schematic diagram of the autoencoder. Let the number of training data be $N$, the $k$th training data (input) $x^k = \left( x_1^k, x_2^k, \ldots, x_n^k \right)^T$, the encoder output $y^k = \left( y_1^k, y_2^k, \ldots, y_m^k \right)^T$ $(m < n)$, and the decoder output $\tilde{x}^k = \left( \tilde{x}_1^k, \tilde{x}_2^k, \ldots, \tilde{x}_n^k \right)^T$. Then, the objective function $E$ to be minimized in the training process of the autoencoder is given by

$E = \dfrac{1}{2} \sum_{k=1}^{N} \left\| \tilde{x}^k - x^k \right\|^2$   (1.3.20)

Here, it is assumed that the input data are used as the teacher data also. This means that the output $y^k$ of the encoder can be considered a compressed representation of the input data $x^k$.
Unlike the conventional autoencoders, the variational autoencoders [37, 38] learn probability distributions. Figure 1.30 shows a schematic diagram of the operation of a variational autoencoder. The encoder is assumed to represent the probability distribution of Eq. (1.3.21). Here, $z = (z_1, z_2, \ldots, z_m)^T$ is called a latent variable and is usually of much lower dimension ($m \ll n$) than the input $x = (x_1, x_2, \ldots, x_n)^T$. Note that $N\left( z \mid \mu, \sigma^2 \right)$ is a multidimensional normal distribution with mean $\mu = (\mu_1, \mu_2, \ldots, \mu_m)^T$ and variance $\sigma^2$ (standard deviation $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_m)^T$).

$q_\phi(z \mid x) = N\left( z \mid \mu, \sigma^2 \right)$   (1.3.21)

Fig. 1.29 Encoder and decoder in an autoencoder

Fig. 1.30 Variational autoencoder

The encoder outputs the mean $\mu$ and the standard deviation $\sigma$ (square root of the variance) as the parameters of the probability distribution for the input $x$. Though $z$ is to be sampled from $N\left( z \mid \mu, \sigma^2 \right)$, it is in practice determined in the variational autoencoder as

$z = \mu + \varepsilon \sigma$   (1.3.22)

to enable the error back propagation learning, where $\varepsilon$ is a number sampled from $N(\varepsilon \mid 0, 1)$. Equation (1.3.22) is called the reparameterization trick.
On the other hand, the decoder is assumed to represent the probability distribution
as follows:

$p_\theta(z) = N(z \mid 0, I)$   (1.3.23)

The objective function $E$ to be minimized in the training process of the variational autoencoder is given by [37]

$E = E_{KL} + E_{Recon}$   (1.3.24)

where $E_{KL}$ is calculated from the Kullback–Leibler (KL) divergence [5], which is a measure of the difference (distance) between the two probability distributions $q_\phi(z \mid x)$ and $p_\theta(z)$, and given by

$E_{KL} = -\dfrac{1}{2} \sum_{j=1}^{m} \left( 1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2 \right)$   (1.3.25)

On the other hand, $E_{Recon}$ is called the reconstruction error, which is expressed by

$E_{Recon} = -\dfrac{1}{L} \sum_{k=1}^{L} \log p_\theta\!\left( x \mid z^k \right)$   (1.3.26)

where $L$ is the number of $\varepsilon$ (i.e., the number of $z$) for a single input $x$. As the output changes with the value of $\varepsilon$, averaging is naturally performed. For example, when the input and output are image data (each pixel represented by a real number in (0, 1)), we often use the following equation:

$E_{Recon} = -\dfrac{1}{L} \sum_{k=1}^{L} \sum_{i=1}^{n} \left\{ x_i \log y_i^k + (1 - x_i) \log\left( 1 - y_i^k \right) \right\}$   (1.3.27)

where $y_i^k$ is a pixel of the image generated from $z^k$.
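A minimal sketch of the reparameterization trick of Eq. (1.3.22) and of the KL term of Eq. (1.3.25), assuming NumPy; the values of μ and σ stand in for encoder outputs and are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 0.2])        # encoder output: mean (assumed)
sigma = np.array([0.8, 0.3, 1.2])      # encoder output: standard deviation (assumed)

eps = rng.standard_normal(mu.shape)    # eps sampled from N(0, 1)
z = mu + eps * sigma                   # Eq. (1.3.22): latent sample

# Eq. (1.3.25): KL divergence between q(z|x) = N(mu, sigma^2) and N(0, I)
E_KL = -0.5 * np.sum(1.0 + np.log(sigma ** 2) - mu ** 2 - sigma ** 2)
print(z, E_KL)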

1.3.8 Automatic Differentiation

As written in Sect. 1.2, differentiation is frequently used in the error back propagation learning (see Sect. 2.1 for details). Libraries for deep learning (such as TensorFlow) are usually equipped with an automatic differentiation function for calculating derivatives, which is useful for general purposes as well.
As is well known, the derivative of a function $f(x)$ is defined as

$\dfrac{df}{dx} = \lim_{h \to 0} \dfrac{f(x + h) - f(x)}{h}$   (1.3.28)

Similarly, the partial derivatives of a function $f(x, y)$ are written as follows:

$\dfrac{\partial f}{\partial x} = \lim_{h \to 0} \dfrac{f(x + h, y) - f(x, y)}{h}$   (1.3.29)

$\dfrac{\partial f}{\partial y} = \lim_{h \to 0} \dfrac{f(x, y + h) - f(x, y)}{h}$   (1.3.30)

On the other hand, several methods for differentiation are available on a computer,
including numerical differentiation, symbolic differentiation and automatic differ-
entiation. Let us clarify the difference between them.

First, numerical differentiation is basically a method that attempts to find the derivative according to the definition of differentiation. Since limit operations are difficult to perform on a computer, the derivative of a function $f(x)$ is calculated with a small value of $h$ as follows:

$\dfrac{df}{dx} \approx \dfrac{f(x + h) - f(x)}{h}$   (1.3.31)

Viewing the definition of the derivative (Eq. (1.3.28)), it is expected that a good approximation can be obtained by making $h$ sufficiently small. In reality, however, the computed derivative becomes zero when $h$ is very small, i.e.,

$\exists \delta > 0, \ |h| < \delta, \quad \dfrac{f(x + h) - f(x)}{h} = 0$   (1.3.32)

This is because $f(x + h)$ and $f(x)$ are interpreted as the same value in a computer for an extremely small value of $h$, since the precision of the numerical representation in a computer is limited to finite digits. Thus, $h$ cannot be made sufficiently small, making it difficult to obtain accurate derivative values with numerical differentiation.
On the other hand, in symbolic differentiation, derivatives are obtained in the form of equations, which results in accurate evaluation of derivative values. There have been some software systems that can perform symbolic differentiation, such as REDUCE and Maxima, which have been developed since the 1960s. In addition, commercial software such as Mathematica, Maple, and Derive is equipped with formula manipulation systems including symbolic differentiation. In terms of deep learning, the partial differentiation of the output of a neural network with respect to a connection weight is often encountered, but its mathematical expression can be complicated (see Sect. 2.1), and the benefit of obtaining the mathematical expression in explicit form is little. The error back propagation learning does not require explicit expressions of derivatives; it is sufficient to know the values of the derivatives. For this reason, obtaining the derivative as a mathematical expression is considered excessive for deep learning.
The last method, automatic differentiation, provides accurate derivative values unlike numerical differentiation. Although automatic differentiation cannot provide rigorous mathematical expressions in explicit form unlike symbolic differentiation, the method is useful for deep learning because it calculates the exact derivative value based on the chain rule of differentiation, with all arithmetic operations represented by computational graphs.
As a simple example, Fig. 1.31 shows a computational graph for z = x + y. The
nodes in a computational graph represent numerical values, and the lines connecting
the nodes represent operations (such as arithmetic operations) on the numerical
values.

Fig. 1.31 Computational graph

Figure 1.32 shows the structure of a three-layer feedforward neural network, where the output of the unit in the output layer is represented as $O_1$, the outputs of the units in the hidden layer as $H_1$, $H_2$, and $H_3$, the outputs of the units in the input layer (i.e., the inputs) as $I_1$ and $I_2$, the connection weights between the unit in the output layer and the units in the hidden layer as $c_{11}, c_{12}, c_{13}$, and the connection weights between the units in the hidden layer and the units in the input layer as $b_{11}, \ldots, b_{32}$. Let $f()$ be the activation function of the units in the output layer and also in the hidden layer. For simplicity, we assume that the bias values of all units are zero. Then, we have

$H_1 = f\!\left( \sum_{i=1}^{2} b_{1i} I_i \right) = f(b_{11} I_1 + b_{12} I_2)$   (1.3.33)

$H_2 = f\!\left( \sum_{i=1}^{2} b_{2i} I_i \right) = f(b_{21} I_1 + b_{22} I_2)$   (1.3.34)

$H_3 = f\!\left( \sum_{i=1}^{2} b_{3i} I_i \right) = f(b_{31} I_1 + b_{32} I_2)$   (1.3.35)

$O_1 = f\!\left( \sum_{j=1}^{3} c_{1j} H_j \right) = f(c_{11} H_1 + c_{12} H_2 + c_{13} H_3)$   (1.3.36)

A computational graph of this feedforward neural network is shown in Fig. 1.33 with

$v_{-1} = I_1, \quad v_0 = I_2$   (1.3.37)

$v_1 = b_{11} v_{-1} + b_{12} v_0, \quad v_2 = b_{21} v_{-1} + b_{22} v_0, \quad v_3 = b_{31} v_{-1} + b_{32} v_0$   (1.3.38)

$v_4 = f(v_1), \quad v_5 = f(v_2), \quad v_6 = f(v_3)$   (1.3.39)

$v_7 = c_{11} v_4 + c_{12} v_5 + c_{13} v_6$   (1.3.40)

$v_8 = f(v_7) = O_1$   (1.3.41)

Fig. 1.32 Three-layered feedforward neural network

Fig. 1.33 Computational graph of three-layered feedforward neural network

Let us calculate the derivative of the output with respect to the input using the automatic differentiation method on this computational graph:

$\dfrac{\partial O_1}{\partial I_1}$   (1.3.42)

Two methods of automatic differentiation are available: the forward and the reverse modes. First, let us try the forward mode, in which the derivative of $v_i$ with respect to $I_1$ is calculated sequentially as

$\dot{v}_i = \dfrac{\partial v_i}{\partial I_1} = \dfrac{\partial v_i}{\partial v_{-1}}$   (1.3.43)

Finally, the derivative of the output with respect to $I_1$ is calculated as

$\dot{v}_8 = \dfrac{\partial v_8}{\partial v_{-1}} = \dfrac{\partial v_8}{\partial I_1} = \dfrac{\partial O_1}{\partial I_1}$   (1.3.44)

The calculation process in the forward mode is shown in order as follows:

$\dot{v}_{-1} = \dfrac{\partial v_{-1}}{\partial v_{-1}} = 1$   (1.3.45)

$\dot{v}_0 = \dfrac{\partial v_0}{\partial v_{-1}} = 0$   (1.3.46)

$\dot{v}_1 = \dfrac{\partial v_1}{\partial v_{-1}} = \dfrac{\partial (b_{11} v_{-1} + b_{12} v_0)}{\partial v_{-1}} = b_{11} \dot{v}_{-1} + b_{12} \dot{v}_0 = b_{11}$   (1.3.47)

$\dot{v}_2 = \dfrac{\partial v_2}{\partial v_{-1}} = \dfrac{\partial (b_{21} v_{-1} + b_{22} v_0)}{\partial v_{-1}} = b_{21} \dot{v}_{-1} + b_{22} \dot{v}_0 = b_{21}$   (1.3.48)

$\dot{v}_3 = \dfrac{\partial v_3}{\partial v_{-1}} = \dfrac{\partial (b_{31} v_{-1} + b_{32} v_0)}{\partial v_{-1}} = b_{31} \dot{v}_{-1} + b_{32} \dot{v}_0 = b_{31}$   (1.3.49)

$\dot{v}_4 = \dfrac{\partial v_4}{\partial v_{-1}} = \dfrac{\partial f(v_1)}{\partial v_{-1}} = \dfrac{\partial f(v_1)}{\partial v_1} \dfrac{\partial v_1}{\partial v_{-1}} = f'(v_1) \dot{v}_1$   (1.3.50)

$\dot{v}_5 = \dfrac{\partial v_5}{\partial v_{-1}} = \dfrac{\partial f(v_2)}{\partial v_{-1}} = \dfrac{\partial f(v_2)}{\partial v_2} \dfrac{\partial v_2}{\partial v_{-1}} = f'(v_2) \dot{v}_2$   (1.3.51)

$\dot{v}_6 = \dfrac{\partial v_6}{\partial v_{-1}} = \dfrac{\partial f(v_3)}{\partial v_{-1}} = \dfrac{\partial f(v_3)}{\partial v_3} \dfrac{\partial v_3}{\partial v_{-1}} = f'(v_3) \dot{v}_3$   (1.3.52)

$\dot{v}_7 = \dfrac{\partial v_7}{\partial v_{-1}} = \dfrac{\partial (c_{11} v_4 + c_{12} v_5 + c_{13} v_6)}{\partial v_{-1}} = c_{11} \dot{v}_4 + c_{12} \dot{v}_5 + c_{13} \dot{v}_6$   (1.3.53)

$\dot{v}_8 = \dfrac{\partial f(v_7)}{\partial v_{-1}} = \dfrac{\partial f(v_7)}{\partial v_7} \dfrac{\partial v_7}{\partial v_{-1}} = f'(v_7) \dot{v}_7$   (1.3.54)

Note that the calculations in the forward mode are performed sequentially from the input side, and all the values required for each calculation are known, since they have already been calculated at preceding steps or are easily calculated at the current step.
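To make the forward-mode procedure above concrete, the following sketch (assuming NumPy, tanh as the activation f, and illustrative values for b, c, I1, and I2, none of which come from the text) propagates the derivatives of Eqs. (1.3.45)–(1.3.54) from the input side and compares the result with the numerical differentiation of Eq. (1.3.31).

import numpy as np

def f(x):                 # activation function (tanh assumed for illustration)
    return np.tanh(x)

def fp(x):                # its derivative
    return 1.0 - np.tanh(x) ** 2

b = np.array([[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]])   # b_{ji}, hidden weights (assumed)
c = np.array([0.7, -0.6, 0.5])                          # c_{1j}, output weights (assumed)
I = np.array([0.3, -0.8])                               # inputs I1, I2 (assumed)

def network(I):
    H = f(b @ I)                     # hidden outputs H1, H2, H3
    return f(c @ H)                  # output O1

# Forward mode: propagate dv_i/dI1 from the input side
v_dot_in = np.array([1.0, 0.0])      # (dI1/dI1, dI2/dI1), Eqs. (1.3.45)-(1.3.46)
u = b @ I                            # v1, v2, v3
u_dot = b @ v_dot_in                 # Eqs. (1.3.47)-(1.3.49)
H_dot = fp(u) * u_dot                # Eqs. (1.3.50)-(1.3.52)
v7_dot = c @ H_dot                   # Eq. (1.3.53)
dO1_dI1 = fp(c @ f(u)) * v7_dot      # Eq. (1.3.54)

# Numerical differentiation (Eq. (1.3.31)) for comparison
h = 1e-6
fd = (network(I + np.array([h, 0.0])) - network(I)) / h
print(dO1_dI1, fd)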
Now, let us try the reverse mode, where the derivative of $O_1$ with respect to $v_i$ is calculated sequentially following Fig. 1.33 as

$\bar{v}_i = \dfrac{\partial O_1}{\partial v_i}$   (1.3.55)

Finally, the derivative of the output with respect to $I_1$ is calculated as follows:

$\bar{v}_{-1} = \dfrac{\partial O_1}{\partial v_{-1}} = \dfrac{\partial O_1}{\partial I_1}$   (1.3.56)

The calculation process in the reverse mode is shown in order as

$\bar{v}_8 = \dfrac{\partial O_1}{\partial v_8} = \dfrac{\partial O_1}{\partial O_1} = 1$   (1.3.57)

$\bar{v}_7 = \dfrac{\partial O_1}{\partial v_7} = \dfrac{\partial v_8}{\partial v_7} = \dfrac{\partial f(v_7)}{\partial v_7} = f'(v_7)$   (1.3.58)

$\bar{v}_6 = \dfrac{\partial O_1}{\partial v_6} = \dfrac{\partial v_8}{\partial v_6} = \dfrac{\partial v_8}{\partial v_7} \dfrac{\partial v_7}{\partial v_6} = \bar{v}_7 \dfrac{\partial (c_{11} v_4 + c_{12} v_5 + c_{13} v_6)}{\partial v_6} = \bar{v}_7 c_{13}$   (1.3.59)

$\bar{v}_5 = \dfrac{\partial O_1}{\partial v_5} = \dfrac{\partial v_8}{\partial v_5} = \dfrac{\partial v_8}{\partial v_7} \dfrac{\partial v_7}{\partial v_5} = \bar{v}_7 \dfrac{\partial (c_{11} v_4 + c_{12} v_5 + c_{13} v_6)}{\partial v_5} = \bar{v}_7 c_{12}$   (1.3.60)

$\bar{v}_4 = \dfrac{\partial O_1}{\partial v_4} = \dfrac{\partial v_8}{\partial v_4} = \dfrac{\partial v_8}{\partial v_7} \dfrac{\partial v_7}{\partial v_4} = \bar{v}_7 \dfrac{\partial (c_{11} v_4 + c_{12} v_5 + c_{13} v_6)}{\partial v_4} = \bar{v}_7 c_{11}$   (1.3.61)

$\bar{v}_3 = \dfrac{\partial O_1}{\partial v_3} = \dfrac{\partial O_1}{\partial v_6} \dfrac{\partial v_6}{\partial v_3} = \bar{v}_6 \dfrac{\partial f(v_3)}{\partial v_3} = \bar{v}_6 f'(v_3)$   (1.3.62)

$\bar{v}_2 = \dfrac{\partial O_1}{\partial v_2} = \dfrac{\partial O_1}{\partial v_5} \dfrac{\partial v_5}{\partial v_2} = \bar{v}_5 \dfrac{\partial f(v_2)}{\partial v_2} = \bar{v}_5 f'(v_2)$   (1.3.63)

$\bar{v}_1 = \dfrac{\partial O_1}{\partial v_1} = \dfrac{\partial O_1}{\partial v_4} \dfrac{\partial v_4}{\partial v_1} = \bar{v}_4 \dfrac{\partial f(v_1)}{\partial v_1} = \bar{v}_4 f'(v_1)$   (1.3.64)

$\bar{v}_0 = \dfrac{\partial O_1}{\partial v_0} = \dfrac{\partial O_1}{\partial v_1} \dfrac{\partial v_1}{\partial v_0} + \dfrac{\partial O_1}{\partial v_2} \dfrac{\partial v_2}{\partial v_0} + \dfrac{\partial O_1}{\partial v_3} \dfrac{\partial v_3}{\partial v_0} = \bar{v}_1 \dfrac{\partial (b_{11} v_{-1} + b_{12} v_0)}{\partial v_0} + \bar{v}_2 \dfrac{\partial (b_{21} v_{-1} + b_{22} v_0)}{\partial v_0} + \bar{v}_3 \dfrac{\partial (b_{31} v_{-1} + b_{32} v_0)}{\partial v_0} = \bar{v}_1 b_{12} + \bar{v}_2 b_{22} + \bar{v}_3 b_{32}$   (1.3.65)

$\bar{v}_{-1} = \dfrac{\partial O_1}{\partial v_{-1}} = \dfrac{\partial O_1}{\partial v_1} \dfrac{\partial v_1}{\partial v_{-1}} + \dfrac{\partial O_1}{\partial v_2} \dfrac{\partial v_2}{\partial v_{-1}} + \dfrac{\partial O_1}{\partial v_3} \dfrac{\partial v_3}{\partial v_{-1}} = \bar{v}_1 \dfrac{\partial (b_{11} v_{-1} + b_{12} v_0)}{\partial v_{-1}} + \bar{v}_2 \dfrac{\partial (b_{21} v_{-1} + b_{22} v_0)}{\partial v_{-1}} + \bar{v}_3 \dfrac{\partial (b_{31} v_{-1} + b_{32} v_0)}{\partial v_{-1}} = \bar{v}_1 b_{11} + \bar{v}_2 b_{21} + \bar{v}_3 b_{31}$   (1.3.66)

Note that the calculations in the reverse mode are performed sequentially from the
output side, and all the values required for each calculation are known when needed.
Since the error back propagation algorithm, which is the standard training method
for neural networks and deep learning, often uses partial differentiation with respect
to parameters, many deep learning libraries have an automatic differentiation func-
tion. In particular, the automatic differentiation method in the reverse mode is
closely related to the error back propagation algorithm [3]. It is noted that auto-
matic differentiation plays an important role in physics-informed neural networks
(Sect. 2.4.3).

References

1. Amari, S.: A theory of adaptive pattern classifiers. IEEE Trans. Electron. Comput. EC-16,
299–307 (1967)
2. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN, arXiv: 1701.07875, (2017)
3. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in
machine learning: a survey. J. Mach. Learn. Res. 18, 5595–5637 (2018)
4. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep
networks. Proceedings of NIPS, (2006)
5. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer (2006)
6. Biswas, A., Chandrakasan, A.P.: Conv-RAM: An energy-efficient SRAM with embedded
convolution computation for low-power CNN-based machine learning applications, 2018 IEEE
International Solid - State Circuits Conference - (ISSCC), 2018, pp. 488–490, https://doi.org/
10.1109/ISSCC.2018.8310397
7. Breiman, L.: Random forests, Machine Learning. 45(1), 5–32 (2001)
8. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A.,
Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T.,
Child, R., Ramesh, A., Ziegler, D.M., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E.,
Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever,
I., Amodei, D.: Language models are few-shot learners. arXiv: 2005.14165, (2020)
9. Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: InfoGAN: Inter-
pretable representation learning by information maximizing generative adversarial nets. arXiv:
1606.03657, (2016)
10. Ciresan, D., Meier, U., Masci, J., Schmidhuber, J.: Multi-column deep neural network for traffic
sign classification. Neural Netw. 32, 333–338 (2012)

11. Conneau, A., Baevski, A., Collobert, R., Mohamed, A., Auli, M.: Unsupervised cross-lingual
representation learning for speech recognition. arXiv: 2006.13979, (2020)
12. Courbariaux, M., Bengio, Y., David, J.P.: Binaryconnect: Training deep neural networks with
binary weights during propagations. Adv. Neural Inf. Process. Sys. 28, 3105–3113 (2015)
13. Courbariaux, M., David, J.P., Bengio. Y.: Low precision storage for deep learning, arXiv:
1412.7024, (2014)
14. Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., Bengio. Y.: Binarized neural networks:
Training deep neural networks with weights and activations constrained to +1 or −1, arXiv:
1602.02830, (2016)
15. Deng, C., Liao, S., Xie, Y., Parhi, K.K., Qian, X., Yuan, B.: PermDNN: efficient compressed
DNN architecture with permuted diagonal matrices, Proceedings of the 51st Annual IEEE/ACM
International Symposium on Microarchitecture (MICRO-51), 2018, pp. 189–202, https://doi.
org/10.1109/MICRO.2018.00024
16. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional
transformers for language understanding, arXiv: 1810.04805, (2018)
17. Elman, J.L., Zipser, D.: Learning the hidden structure of speech. Journal of the Acoustical
Society of America 83, 1615–1626 (1988)
18. Fukushima, K.: Neocognitron: a self-organizing neural network model for a mechanism of
pattern recognition unaffected by shift in position. Biol. Cybern. 36(4), 193–202 (1980)
19. Funahashi, K.: On the approximate realization of continuous mappings by neural networks.
Neural Netw. 2, 183–192 (1989)
20. Gong, J., Shen, H., Zhang, G., Liu, X., Li, S., Jin, G., Maheshwari, N., Fomenko, E., Segal,
E.: Highly efficient 8-bit low precision inference of convolutional neural networks with Intel-
Caffe, In Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament on
Co-designing Pareto-efficient Deep Learning (ReQuEST ‘18). Association for Computing
Machinery, New York, NY, USA, Article 2, 1. https://doi.org/10.1145/3229762.3229763
21. Gonzalez, R.C., Woods, R.E.: Digital Image Processing (Second Edition). Prentice-Hall (2002)
22. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016)
23. Goodfellow, I.J., Shlens, J. Szegedy, C.: Explaining and harnessing adversarial examples. arXiv:
1412.6572, (2014)
24. Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Bengio,
Y.: Generative adversarial networks. arXiv: 1406.2661, (2014)
25. Gupta, S., Agrawal, A., Gopalakrishnan, K., Narayanan, P.: Deep learning with limited numer-
ical precision. Proceedings of the 32nd International Conference on Machine Learning, Lille,
France, 2015.
26. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. 2016
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016,
pp. 770–778, https://doi.org/10.1109/CVPR.2016.90.
27. Haykin, S.: Neural Networks: A Comprehensive Foundation. Prentice Hall (1999)
28. Hinton, G.E., Osindero, S., Teh, Y.: A fast learning algorithm for deep belief nets. Neural
Comput. 18, 1527–1544 (2006)
29. Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are universal
approximators. Neural Netw. 2, 359–366 (1989)
30. Hu, J., Shen, L., Sun, G.: Squeeze-and-Excitation Networks. 2018 IEEE/CVF Conference on
Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 7132–7141,
https://doi.org/10.1109/CVPR.2018.00745.
31. Hubel, D.H., Wiesel, T.N.: Receptive fields, binocular interaction and functional architecture
in cat’s visual cortex. J. Physiol. 160, 106–154 (1962)
32. Hughes, T.J.R.: The Finite Element Method : Linear Static and Dynamic Finite Element
Analysis. Dover (2000)
33. Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing
internal covariate shift. In International conference on machine learning (pp. 448–456). PMLR,
2015.

34. Jaitly, N., Hinton, G.: Vocal Tract Length Perturbation (VTLP) improves speech recognition.
in ICML Workshop on Deep Learning for Audio, Speech and Language Processing, 2013.
35. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality,
stability, and variation. arXiv: 1710.10196, (2017)
36. Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial
networks. arXiv: 1812.04948, (2018)
37. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. arXiv: 1312.6114, (2013)
38. Kingma, D.P., Welling, M.: An Introduction to Variational Autoencoders. Found. Trends Mach.
Learn. 12(4), 307–392 (2019). https://doi.org/10.1561/2200000056
39. Kirk, D.B., Hwu, W.W.: Programming Massively Parallel Processors: A Hands-on Approach.
Morgan Kaufmann (2010)
40. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional
neural networks. In NIPS’ 2012, (2012)
41. Kubo, S.: Inverse problems related to the mechanics and fracture of solids and structures. JSME
Int. J. 31(2), 157–166 (1988)
42. LeCun, Y.: Generalization and network design strategies. Technical Report CRG-TR-89-4,
Department of Computer Science, University of Toronto (1989)
43. LeCun, Y., Bengio, Y., Hinton, G.E.: Deep learning. Nature 521, 436–444 (2015)
44. Mao, X., Li, Q., Xie, H., Lau, R.Y.K., Wang, Z., Smolley, S.P.: Least squares generative
adversarial networks. arXiv: 1611.04076, (2016)
45. McCulloch, W.S., Pitts, W.: A logical calculus of the ideas immanent in nervous activity. Bull.
Math. Biophys. 5, 115–133 (1943)
46. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. Springer-
Verlag (1992)
47. Minsky, M.L., Papert, S.A.: Perceptrons. MIT Press (1969)
48. Mirza, M., Osindero, S.: Conditional generative adversarial nets. arXiv: 1411.1784, (2014)
49. Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press (2012)
50. Oishi, A., Yagawa, G., Computational mechanics enhanced by deep learning. Comput. Methods
Appl. Mech. Eng. 327, 327–351 (2017)
51. Oishi, A., Yamada, K., Yoshimura, S., Yagawa, G.: Quantitative nondestructive evaluation with
ultrasonic method using neural networks and computational mechanics. Comput. Mech. 15,
521–533 (1995)
52. Oishi, A., Yamada, K., Yoshimura, S., Yagawa, G., Nagai, S., Matsuda, Y.: Neural network-
based inverse analysis for defect identification with laser ultrasonics. Res. Nondestruct. Eval.
13(2), 79–95 (2001)
53. Park, D.S., Chan, W., Zhang, Y., Chiu, C., Zoph, B., Cubuk, E.D., Le, Q.V.: SpecAugment: A
simple data augmentation method for automatic speech recognition. Proceedings of Interspeech
2019, pp. 2613–2617, https://doi.org/10.21437/Interspeech.2019-2680
54. Ping, W., Peng, K., Gibiansky, A., Arik, S.O., Kannan, A., Narang, S., Raiman, J., Miller, J.:
Deep voice 3: Scaling text-to-speech with convolutional sequence learning. arXiv: 1710.07654,
(2018)
55. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolu-
tional generative adversarial networks. arXiv: 1511.06434, (2015)
56. Rosenblatt, F.: The perceptron: A probabilistic model for information storage and organization
in the brain. Psychol. Rev. 65, 386–408 (1958)
57. Rosenblatt, F.: On the convergence of reinforcement procedures in simple perceptrons. Cornell
Aeronautical Laboratory Report, VG-1196-G-4, (1960)
58. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating
errors. Nature, 323, 533–536 (1986)
59. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A.,
Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large Scale Visual Recognition
Challenge. Int. J. Comput. Vision 115, 211–252 (2015). https://doi.org/10.1007/s11263-015-
0816-y

60. Sietsma, J., Dow, R.: Creating artificial neural networks that generalize. Neural Netw. 4, 67–79
(1991)
61. Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser,
J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalch-
brenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., Hassabis, D.:
Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489
(2016)
62. Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T.,
Baker, L., Lai, M., Bolton, A., Chen, Y., Lillicrap, T., Hui, F., Sifre,L., van den Driessche, G.,
Graepel, T., Hassabis, D.: Mastering the game of Go without human knowledge. Nature 550,
354–359 (2017)
63. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition. ICLR 2015, arXiv: 1409.1556, (2015)
64. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple
way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
65. Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. Adv.
Neural Inf. Process. Sys. 27, 3104–3112 (2014).
66. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V.,
Rabinovich, A.: Going deeper with convolutions. IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 2015, pp. 1–9, https://doi.org/10.1109/CVPR.2015.7298594
67. Ueberhuber, C.W.: Numerical Computation 1: Methods, Software, and Analysis. Springer
(1997)
68. Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., Fergus, R.: Regularization of neural networks
using DropConnect. Proceedings of the 30th International Conference on Machine Learning,
in PMLR 28(3), 2013, pp. 1058–1066
69. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet D.,
Pajdla T., Schiele B., Tuytelaars T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture
Notes in Computer Science, vol 8689. Springer, Cham. https://doi.org/10.1007/978-3-319-
10590-1_53
70. Zeng, X., Ouyang, W., Yan, J., Li, H., Xiao, T., Wang, K., Liu, Y., Zhou, Y., Yang, B., Wang,
Z., Zhou, H., Wanget, X.: Crafting GBD-Net for object detection. IEEE Trans. Pattern Anal.
Mach. Intell. 40(09), 2109–2123 (2018)
71. Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-
consistent adversarial networks. arXiv: 1703.10593, (2017)
72. Zienkiewicz, O.C., Morgan, K.: Finite Elements and Approximation. Dover (2006)
Chapter 2
Mathematical Background for Deep Learning

Abstract This chapter deals with the operation of neural networks and deep learning
in detail using mathematical formulas. Section 2.1 explains the feedforward neural
network including the error back propagation algorithm, Sect. 2.2 the convolutional
neural networks, which have become the mainstream of deep learning in recent years,
and Sect. 2.3 compares various methods for accelerating the training process. Finally,
Sect. 2.4 describes regularization methods to suppress overtraining for improving
performance of the trained neural networks.

2.1 Feedforward Neural Network

Regarding a fully connected feedforward neural network, which is the most basic
neural network, both the forward and the error back propagation processes are
discussed here using mathematical expressions.
In general, a feedforward neural network has a hierarchical structure of units with nonlinear transformation functions [4], and the input–output relationship of the $j$-th unit of the $l$-th layer is expressed as

$O_j^l = f\!\left( U_j^l \right)$   (2.1.1)

where
$O_j^l$: output value of the activation function of the $j$-th unit in the $l$-th layer,
$U_j^l$: input value to the activation function of the $j$-th unit in the $l$-th layer,
$f()$: activation function.
Note that $U_j^l$ is expressed using the input from the units in the previous layer as follows:

$U_j^l = \sum_{i=1}^{n_{l-1}} w_{ji}^{l-1} \cdot O_i^{l-1} + \theta_j^l$   (2.1.2)


where
$n_{l-1}$: the number of units in the $(l-1)$-th layer,
$w_{ji}^{l-1}$: the connection weight between the $i$-th unit in the $(l-1)$-th layer and the $j$-th unit in the $l$-th layer,
$O_i^{l-1}$: the output of the $i$-th unit in the $(l-1)$-th layer,
$\theta_j^l$: the bias of the $j$-th unit in the $l$-th layer.
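A minimal sketch of the forward pass of Eqs. (2.1.1)–(2.1.2) for a single layer, assuming NumPy and the sigmoid function of Eq. (2.1.3) below as the activation; the layer sizes, weights, biases, and inputs are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_prev, n_curr = 4, 3
W = rng.normal(size=(n_curr, n_prev))   # w_ji^{l-1}, connection weights (assumed)
theta = rng.normal(size=n_curr)         # theta_j^l, biases (assumed)
O_prev = rng.normal(size=n_prev)        # O_i^{l-1}, outputs of the previous layer

U = W @ O_prev + theta                  # Eq. (2.1.2)
O = sigmoid(U)                          # Eq. (2.1.1)
print(O)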
As for the activation function $f()$, the following functions are often employed:

$f(x) = \dfrac{1}{1 + e^{-x}}$   (sigmoid function)   (2.1.3)

$f(x) = \tanh x = \dfrac{e^x - e^{-x}}{e^x + e^{-x}}$   (hyperbolic tangent function)   (2.1.4)

$f(x) = \begin{cases} x & (x \ge 0) \\ 0 & (x < 0) \end{cases}$   (rectified linear unit (ReLU) function)   (2.1.5)

The graphs of the above functions are shown in Fig. 2.1.


These functions are known to have the common feature that the first-order derivative of the function, required for the error back propagation algorithm, can be easily obtained. For example, the first-order derivative of the sigmoid function is calculated as follows:

$\dfrac{df}{dx} = \left( \dfrac{1}{1 + e^{-x}} \right)' = \dfrac{e^{-x}}{\left( 1 + e^{-x} \right)^2} = \dfrac{\left( 1 + e^{-x} \right) - 1}{\left( 1 + e^{-x} \right)^2} = f(x)(1 - f(x))$   (2.1.6)

Regarding the hyperbolic tangent function and the ReLU function, we, respec-
tively, have

Fig. 2.1 Activation functions: sigmoid(x), tanh(x), ReLU(x)

Fig. 2.2 First derivatives of activation functions: sigmoid(x), tanh(x), ReLU(x)

$\dfrac{df}{dx} = \left( \dfrac{e^x - e^{-x}}{e^x + e^{-x}} \right)' = 1 - \left( \dfrac{e^x - e^{-x}}{e^x + e^{-x}} \right)^2 = 1 - (f(x))^2$   (2.1.7)

$\dfrac{df}{dx} = \begin{cases} 1 & (x \ge 0) \\ 0 & (x < 0) \end{cases}$   (2.1.8)

The first-order derivatives of these three functions are shown in Fig. 2.2, where the derivative of the sigmoid function takes a value of at most 0.25, which is smaller than the derivatives of the others.
Note that the following function is usually used as the activation function of the input units,

$f(x) = x$   (2.1.9)

and that a linear function is often used as the activation function of the output units:

$f(x) = ax + b$   (2.1.10)

The input data to a feedforward neural network undergo a nonlinear transformation


in a unit in each layer, and the output of a unit in the output layer becomes the output
of the neural network. Then, the output is compared with the corresponding teacher
signal and the error is evaluated. The most common error used for a feedforward
neural network is the squared error shown as follows:

1Σ P Σ n n
L
( p L p )2
E= O j − Tj (2.1.11)
2 p=1 j=1
52 2 Mathematical Background for Deep Learning

where
p L
O j : the output of the j-th unit in the output at the L-th layer (output layer) for
the p-th training pattern,
p
T j : the teacher signal corresponding to the output of the j-th unit in the output
layer for the p-th training pattern,
n L : the number of units in the L-th layer (output layer),
n P : the total number of training patterns.
In stochastic gradient descent methods, the error for each pattern is often used instead
of Eq. (2.1.11) as


n
L
( p L p )2
E= O j − Tj (2.1.12)
2 j=1

From Eqs. (2.1.11) and (2.1.12), this error is regarded as a function of the
connection weights and the biases. Then, it is written as

E = E(w, θ ) (2.1.13)

where w is a vector of all the connection weights and θ a vector of all the biases.
Finding w and θ that minimize the error is called training or learning of a feedforward
neural network. The error back propagation algorithm based on the steepest descent
method is widely used for training, where the connection weights and biases are
iteratively updated based on the gradient of the error as

∂ E(w, θ )
Δwl−1
ji = − (2.1.14)
∂wl−1
ji

wl−1
ji = w ji + α · Δw ji
l−1 l−1
(2.1.15)

∂ E(w, θ )
Δθ lj = − (2.1.16)
∂θ lj

θ lj = θ lj + β · Δθ lj (2.1.17)

where
Δwl−1 ji : the amount of update of the connection weight between the j-th unit in
the l-th layer and the i-th unit in the (l − 1)-th layer,
Δθ lj : the amount of update of the bias at the j-th unit in the l-th layer,
α: the learning coefficient for update of the connection weight,
β: the learning coefficient for update of the bias.
2.1 Feedforward Neural Network 53

Fig. 2.3 Four-layer


feedforward neural network

As an example, the error back propagation algorithm in a 4-layer (L = 4) feedforward


neural network (Fig. 2.3) is discussed here, where the error to be minimized is given
as


n
4
( L )2
E= O j − Tj (2.1.18)
2 j=1

Let us start with the calculation of the amount of update of connection weights.
∂E
First, ∂w 3 is calculated as follows:
ab


n
∂E 4
∂ ( 4 )2
= O j − Tj
∂wab
3 2 j=1 ∂wab
3

Σ
n4
( ) ∂ O 4j
= O 4j − T j
j=1
∂wab
3

( ) ∂ Oa4
= Oa4 − Ta (2.1.19)
∂wab3

( 4)
∂ Oa4 ∂ f Ua
=
∂wab
3
∂wab
3
( 4)
∂ f Ua ∂Ua4
=
∂Ua4 ∂wab
3
( 4 ) (Σn 3 )
∂ f Ua ∂ i=1 wai · Oi + θa
3 3 4
=
∂Ua4 ∂wab3
54 2 Mathematical Background for Deep Learning
( ) n3
∂ f Ua4 Σ ∂wai
3
= Oi3
∂Ua4 i=1 ∂wab
3
( )
∂ f Ua4 3
= Ob (2.1.20)
∂Ua4

Substituting Eq. (2.1.20) into Eq. (2.1.19), we have


( )
∂E ( 4 ) ∂ f Ua4 3
= Oa − Ta Ob (2.1.21)
∂wab
3 ∂Ua4

∂E
Similarly, ∂wcd
2 is calculated as follows:


n
∂E 4
∂ ( 4 )2
= O j − Tj
∂wcd
2 2 j=1 ∂wcd
2

Σ
n4
( ) ∂ O 4j
= O 4j − T j (2.1.22)
j=1
∂wcd
2

( )
∂ O 4j ∂ f U 4j
=
∂wcd
2
∂wcd
2
( )
∂ f U 4j ∂U 4
j
=
∂U 4j ∂wcd
2
( ) (Σ )
n3
∂ f U 4j ∂ i=1 w ji · Oi + θa
3 3 4

=
∂U 4j ∂wcd
2
( )
∂ f U 4j Σ
n3
∂ Oi3
= w 3ji (2.1.23)
∂U 4j i=1
∂wcd
2

( )
∂ Oi3 ∂ f Ui3
=
∂wcd
2
∂wcd
2
( 3)
∂ f Ui ∂Ui3
=
∂Ui3 ∂wcd 2
( 3 ) (Σn 2 )
∂ f Ui ∂ k=1 wik · Ok + θi
2 2 3
=
∂Ui3 ∂wcd2
( 3 ) n2
∂ f Ui Σ ∂wik 2
= Ok2
∂Ui3 k=1 ∂wcd 2
2.1 Feedforward Neural Network 55
( )
∂ f Ui3 ∂wid
2
= Od2 (2.1.24)
∂Ui3 ∂wcd
2

Substituting Eqs. (2.1.23) and (2.1.24) into Eq. (2.1.22), we obtain


( )
( 3)
)∂ f Uj Σ
4
Σn4
( 4 n3
∂E 3 ∂ f Ui ∂wid
2
= O j − T j w ji Od2
∂wcd
2
j=1
∂U 4
j i=1
∂U i
3
∂w 2
cd
( )
( )
)∂ f Uj
4
Σn4
( 4 3 ∂ f Uc
3
= O j − Tj w Od2 (2.1.25)
j=1
∂U 4
j
jc
∂U 3
c

∂E
Further, ∂weg
1 is calculated as


n
∂E 4
∂ ( 4 )2
= O j − Tj
∂weg
1 2 j=1 ∂weg
1

Σ
n4
( ) ∂ O 4j
= O 4j − T j (2.1.26)
j=1
∂weg
1

( )
∂ O 4j ∂ f U 4j
=
∂weg
1 ∂weg
1
( )
∂ f U 4j ∂U 4
j
=
∂U 4j ∂weg
1
( ) (Σ )
n3
∂ f U 4j ∂ i=1 w ji · Oi + θa
3 3 4

=
∂U 4j ∂weg
1
( )
∂ f U 4j Σ
n3
∂ Oi3
= w 3ji (2.1.27)
∂U 4j i=1 ∂weg
1

( )
∂ Oi3 ∂ f Ui3
=
∂weg
1 ∂weg
1
( 3)
∂ f Ui ∂Ui3
=
∂Ui3 ∂weg1
( 3 ) (Σn 2 )
∂ f Ui ∂ k=1 wik · Ok + θi
2 2 3
=
∂Ui3 ∂weg1
56 2 Mathematical Background for Deep Learning
( ) n2
∂ f Ui3 Σ ∂ Ok2
= wik
2
(2.1.28)
∂Ui3 k=1 ∂weg
1

( )
∂ Ok2 ∂ f Uk2
=
∂weg
1 ∂weg
1
( 2)
∂ f Uk ∂Uk2
=
∂Uk3 ∂weg 1
( 2 ) (Σn 1 1 )
∂ f Uk ∂ l=1 wkl · Ol + θi
1 2
=
∂Uk2 ∂weg
1
( 2 ) n1
∂ f Uk Σ ∂wkl 1
= Ol1
∂Uk2 l=1 ∂weg 1

( )
∂ f Uk2 ∂wkg1
= Og1 (2.1.29)
∂Uk2 ∂weg 1

Substituting Eqs. (2.1.27), (2.1.28), and (2.1.29) into Eq. (2.1.26), we achieve
( )
( ) n2 ( )
∂E Σ
n4
( ) ∂ f U 4j Σ
n3
∂ f Ui3 Σ ∂ f Uk2 ∂wkg
1
= O 4j − Tj w 3ji wik
2
Og1
∂weg
1
j=1
∂U 4j i=1
∂Ui3 k=1
∂Uk2 ∂weg
1

( )
( ) ( 2)
)∂ f Uj Σ
4
Σ
n4
( n3
∂ f Ui3 2 ∂ f Ue
= O 4j − T j w 3ji wie Og1
j=1
∂U 4j i=1
∂Ui3 ∂Ue2
(2.1.30)

From these calculations above, the amount of updates of the connection weights is
written as follows:
( )
∂E ( 4 ) ∂ f U 4j
Δw ji = − 3 = − O j − T j
3
Oi3 (2.1.31)
∂w ji ∂U 4j
( )
( 4)
∂ E Σn4
( ) ∂ f U ∂ f U 3j
k
Δw 2ji = − 2 = − Ok4 − Tk wk3j Oi2 (2.1.32)
∂w ji k=1
∂U 4
k ∂U 3
j

∂E
Δw 1ji = −
∂w 1ji
( )
( ) n3 ( )
Σ
n4
( ) ∂ f Uk4 Σ ∂ f Ul3 ∂ f U 2j
=− Ok4 − Tk wkl
3
wl2j Oi1 (2.1.33)
k=1
∂Uk4 l=1
∂Ul3 ∂U 2j
2.1 Feedforward Neural Network 57

∂E
Next, let us calculate the amount of update of biases. First, ∂θa4
is calculated as
follows:


n
∂E 4
∂ ( 4 )2
= O j − Tj
∂θa
4 2 j=1 ∂θa
4

Σ
n4
( ) ∂ O 4j
= O 4j − T j
j=1
∂θa4
( ) ∂ Oa4
= Oa4 − Ta (2.1.34)
∂θa4
( )
∂ Oa4 ∂ f Ua4
=
∂θa4 ∂θa4
( )
∂ f Ua4 ∂Ua4
=
∂Ua4 ∂θa4
( ) (Σn 3 )
∂ f Ua4 ∂ i=1 wai · Oi + θa
3 3 4
=
∂Ua4 ∂θa4
( 4)
∂ f Ua
= (2.1.35)
∂Ua4

Substituting Eq. (2.1.35) into Eq. (2.1.34), we obtain


( )
∂E ( 4 ) ∂ f Ua4
= Oa − Ta (2.1.36)
∂θa4 ∂Ua4

∂E
Similarly, ∂θb3
is calculated as


n
∂E 4
∂ ( 4 )2
= O j − Tj
∂θb
3 2 j=1 ∂θb
3

Σ
n4
( ) ∂ O 4j
= O 4j − T j (2.1.37)
j=1
∂θb3
( )
∂ O 4j ∂ f U 4j
=
∂θb3 ∂θb3
( )
∂ f U 4j ∂U 4
j
=
∂U 4j ∂θb3
58 2 Mathematical Background for Deep Learning
( ) (Σ )
n3
∂ f U 4j ∂ i=1 w ji · Oi + θ j
3 3 4

=
∂U 4j ∂θb3
( )
∂ f U 4j Σ
n3
∂ Oi3
= w 3ji
∂U 4j i=1
∂θb3
( )
∂ f U 4j ∂ Ob3
= w 3jb (2.1.38)
∂U 4j ∂θb3
( )
∂ Ob3 ∂ f Ub3
=
∂θb3 ∂θb3
( )
∂ f Ub3 ∂Ub3
=
∂Ub3 ∂θb3
( ) (Σn 2 )
∂ f Ub3 ∂ k=1 wbk · Ok + θb
2 2 3
=
∂Ub3 ∂θb3
( 3)
∂ f Ub
= (2.1.39)
∂Ub3

Substituting Eqs. (2.1.38) and (2.1.39) into Eq. (2.1.37), we have


( )
( 3)
)∂ f Uj
4
Σn4
( 4
∂E 3 ∂ f Ub
= O j − Tj w jb (2.1.40)
∂θb3 j=1
∂U 4j ∂Ub3

∂E
Further, ∂θc2
is calculated as follows:


n
∂E 4
∂ ( 4 )2
= O − Tj
∂θc2 2 j=1 ∂θc2 j
Σ
n4
( ) ∂ O 4j
= O 4j − T j (2.1.41)
j=1
∂θc2
( )
∂ O 4j ∂ f U 4j
=
∂θc2 ∂θc2
( )
∂ f U 4j ∂U 4
j
=
∂U 4j ∂θc2
( ) (Σ )
n3
∂ f U 4j ∂ i=1 w 3
ji · O 3
i + θ 4
j
=
∂U 4j ∂θc2
2.1 Feedforward Neural Network 59
( )
∂ f U 4j Σ
n3
∂ Oi3
= w 3ji (2.1.42)
∂U 4j i=1
∂θc2
( )
∂ Oi3 ∂ f Ui3
=
∂θc2 ∂θc2
( )
∂ f Ui3 ∂Ui3
=
∂Ui3 ∂θc2
( ) (Σn 2 )
∂ f Ui3 ∂ k=1 wik · Ok + θi
2 2 3
=
∂Ui3 ∂θc2
( 3 ) n2
∂ f Ui Σ 2 ∂ Ok2
= w (2.1.43)
∂Ui3 k=1 ik ∂θc2
( )
∂ Ok2 ∂ f Uk2
=
∂θc2 ∂θc2
( )
∂ f Uk2 ∂Uk2
=
∂Uk2 ∂θc2
( ) (Σn 1 1 )
∂ f Uk2 ∂ l=1 wkl · Ol + θk
1 2
=
∂Uk2 ∂θc2
( 2) 2
∂ f Uk ∂θk
= (2.1.44)
∂Uk2 ∂θc2

Substituting Eqs. (2.1.42), (2.1.43), and (2.1.44) into Eq. (2.1.41), we obtain
( )
( ) n2 ( )
∂E Σ
n4
( ) ∂ f U 4j Σ
n3
∂ f Ui3 Σ ∂ f Uk2 ∂θk2
= O 4j − Tj w 3ji wik
2
∂θc2 j=1
∂U 4j i=1 ∂Ui3 k=1
∂Uk2 ∂θc2
( )
( ) ( 2)
Σ
n4
( ) ∂ f U 4j Σ
n3
∂ f Ui3 2 ∂f Uc
= O 4j − Tj w 3ji wic (2.1.45)
j=1
∂U 4j i=1
∂Ui3 ∂Uc2

After these calculations above, the amount of updates of the biases is given as follows:
( )
∂E ( 4 ) ∂ f Ui4
Δθi4 = −
= − O i − T i (2.1.46)
∂θi4 ∂Ui4
( )
( 3)
Σn4
( 4 ) ∂ f U 4j
∂E 3 ∂ f Ui
Δθi = − 3 = −
3
O j − Tj w ji (2.1.47)
∂θi j=1
∂U 4j ∂Ui3
60 2 Mathematical Background for Deep Learning
( )
( 3) ( 2)
Σn4
( ) ∂ f U 4j Σn3
∂ E 3 ∂ f Uk 2 ∂ f Ui
Δθi = − 2 = −
2
O j − Tj
4
w wki (2.1.48)
∂θi j=1
∂U 4j k=1 jk ∂Uk3 ∂Ui2

2.2 Convolutional Neural Network

In this section, both the forward propagation and the error back propagation of the
convolutional layer in the convolutional neural networks are studied in detail using
mathematical expressions.
Now, the convolutional layer [2, 8], in practice, consists of three kinds of layers:
a convolutional layer, an activation function layer, and a pooling layer.
A convolutional layer is often employed for the image data, which are regarded
as a two-dimensional array, meaning that the units in a convolutional layer of a
convolutional neural network are two-dimensionally arranged. Figure 2.4 shows its
schematic illustration, which takes two-dimensional data of the size M×N as input
and outputs two-dimensional data of the size M×N with a filter of the size S×T.
Connections in the convolutional layer are defined as

Σ T −1
S−1 Σ
p−1 p−1
p
Umn = h st · Om+s,n+t + θmn
p
(2.2.1)
s=0 t=0

S
T

N M

Fig. 2.4 Convolutional layer


2.2 Convolutional Neural Network 61

p−1
where Om+s,n+t is the output of the (m + s, n + t)-th unit in the (p-1)-th layer, where
p−1
units are arranged in a two-dimensional manner, h st the (s, t)-th component of the
p
filter of S×T size for the (p−1)-th layer, θmn the bias of the (m, n)-th unit in the p-th
p
layer and Umn the input value to the activation function of the (m, n)-th unit in the
p-th layer, where units are also arranged in a two-dimensional manner.
For example, when S = T = 3, the summation part of the right-hand side of
Eq. (2.2.1) is the sum of all components of the matrix as
⎛ p−1 p−1 p−1 p−1 p−1 p−1

h 0,0 Om+0,n+0 h 1,0 Om+1,n+0 h 2,0 Om+2,n+0
⎜ p−1 p−1 p−1 p−1 p−1 p−1 ⎟
⎝ h 0,1 Om+0,n+1 h 1,1 Om+1,n+1 h 2,1 Om+2,n+1 ⎠
p−1 p−1 p−1 p−1 p−1 p−1
h 0,2 Om+0,n+2 h 1,2 Om+1,n+2 h 2,2 Om+2,n+2
⎛ p−1 p−1 p−1
⎞ ⎛ p−1 p−1 p−1 ⎞
Om+0,n+0 Om+1,n+0 Om+2,n+0 h h 1,0 h 2,0
⎜ p−1 p−1 p−1 ⎟ ⎜ 0,0 p−1 p−1 p−1 ⎟
= ⎝ Om+0,n+1 Om+1,n+1 Om+2,n+1 ⎠ ʘ ⎝ h 0,1 h 1,1 h 2,1 ⎠ (2.2.2)
p−1 p−1 p−1 p−1 p−1 p−1
Om+0,n+2 Om+1,n+2 Om+2,n+2 h 0,2 h 1,2 h 2,2

where ʘ means the product of the corresponding components of the two matrices,
called the Hadamard product.
p−1 p
In Eq. (2.2.1), h st and θmn are parameters, which are to be updated by the error
back propagation algorithm. The update rule for each parameter is written as follows:

p−1 p−1 p−1 p−1 ∂E


h st ← h st + α1 Δh st = h st − α1 p−1
(2.2.3)
∂h st

p p p p ∂E
θmn ← θmn + α2 Δθmn = θmn − α2 p (2.2.4)
∂θmn

where α1 and α2 are the learning coefficients. The derivative in Eq. (2.2.3) is
calculated as follows:

ΣΣ
M−1 N −1 p
∂E ∂E ∂Umn
= p ·
p−1
∂h st m=0 n=0
∂Umn ∂h p−1
st
(Σ )
S−1 ΣT −1 p−1 p−1 p
Σ Σ ∂E
M−1 N −1 ∂ s=0 t=0 h st · O m+s,n+t + θmn
= p ·
m=0 n=0
∂U mn ∂h
p−1
st
ΣΣ
M−1 N −1
∂E p−1
= p · Om+s,n+t (2.2.5)
m=0 n=0
∂Umn

Similarly, the derivative in Eq. (2.2.4) is


62 2 Mathematical Background for Deep Learning

ΣΣ
M−1 N −1 p
∂E ∂E ∂Umn
p = p · p
∂θmn m=0 n=0
∂Umn ∂θmn
(Σ )
S−1 ΣT −1 p−1 p−1 p
ΣΣ
M−1 N −1
∂E ∂ s=0 t=0 h st · Om+s,n+t + θmn
= p · p
m=0 n=0
∂Umn ∂θmn
ΣΣ
M−1 N −1
∂E
= p · δmm δnn
m=0 n=0
∂Umn
∂E
= p (2.2.6)
∂Umn

where δi j is Kronecker’s delta given as follows:


{
1 (i = j)
δi j = (2.2.7)
0 (i /= j)

p
If a common bias value is used within the same layer, i.e. θmn ≡ θ p as taken often
in practice, then Eq. (2.2.6) turns into

ΣΣ
M−1 N −1
∂E ∂E
= p (2.2.8)
∂θ p
m=0 n=0
∂Umn

As in the case of the fully connected feedforward neural network (Sect. 2.1),
∂E
the ∂U p that appears in the parameter update equation is given by the error back
mn
propagation calculation, where we need such values as ∂ Ep−1 of each layer;
∂ Omn

ΣΣ
M−1 N −1 p
∂E ∂E ∂Umn
= p ·
p−1
∂ Omn m=0 n=0
∂U mn ∂ Omn
p−1
(Σ )
S−1 ΣT −1 p−1 p−1 p
ΣΣ
M−1 N −1
∂E ∂ s=0 t=0 h st · Om+s,n+t + θmn
= p ·
m=0 n=0
∂Umn ∂ Omn
p−1

N −1
( S−1 T −1 )
ΣΣ
M−1
∂E Σ Σ p−1 ∂ Om+s,n+t p−1
= p · h st ·
m=0 n=0
∂Umn s=0 t=0
p−1
∂ Omn
N −1
( S−1 T −1 )
ΣΣ
M−1
∂E Σ Σ p−1
= p · h st · δm,m+s δn,n+t (2.2.9)
m=0 n=0
∂Umn s=0 t=0

Since M ≫ S and N ≫ T in general, δm,m+s δn,n+t is equal to 1 rarely in the above


equation. As an example, when (m, n) = (7, 8) and S = T = 3, the right-hand side
2.2 Convolutional Neural Network 63

of the above equation is the sum of the nine components of the following matrix:
⎛ ∂E p−1 ∂E p−1 ∂E p−1 ⎞
p
∂U5,6
· h 2,2 p
∂U6,6
· h 1,2 p
∂U7,6
· h 0,2
⎜ p−1 ⎟
⎜ ∂E
·
p−1 ∂ E
h 2,1 ∂U p ·
p−1 ∂ E
h 1,1 ∂U p · h 0,1 ⎟
⎝ p
∂U5,7 6,7 7,7 ⎠
∂E p−1 ∂ E p−1 ∂ E p−1
p
∂U5,8
· h 2,0 ∂U p · h 1,0 ∂U p · h 0,0
⎛ ⎞
6,8 7,8

∂E ∂E ∂E ⎛p−1 p−1 p−1



p p p
∂U5,6 ∂U6,6 ∂U7,6 h h h
⎜ ⎟ ⎜ 2,2 1,2 0,2
=⎜
∂E ∂E ∂E ⎟ ʘ ⎝ h p−1 h p−1 h p−1 ⎟
⎝ ⎠ 0,1 ⎠ (2.2.10)
p p p
∂U5,7 ∂U6,7 ∂U7,7 2,1 1,1
∂E ∂E ∂E p−1 p−1 p−1
p p p h 2,0 h 1,0 h 0,0
∂U5,8 ∂U6,8 ∂U7,8

Equation (2.2.10) shows that the back propagation calculation can be done by
convolution with the inverted filter matrix.
In the activation function layer, the activation function is activated by making the
output of the convolutional layer to be input. In the case of the ReLU function, which
is most commonly used with convolutional layers, the output of the activation layer
is calculated as
( p−1 )
p
Umn = ReLU Omn (2.2.11)

where there are no parameters to be trained in the activation function layer. The
derivative for the error back propagation is calculated as follows:

ΣΣ
M−1 N −1 p
∂E ∂E ∂Umn
= p ·
∂ Omn
p−1
m=0 n=0
∂Umn ∂ Omn
p−1
( )
p−1
ΣΣ
M−1 N −1
∂E ∂ReLU Omn
= p ·
m=0 n=0
∂U mn
p−1
∂ Omn
( )
p−1
∂E ∂ReLU Omn
= p ·
∂Umn p−1
∂ Omn
⎧ ( )
⎨ ∂ Ep O p−1 > 0
= ∂Umn ( p−1 )
mn
(2.2.12)
⎩ 0 Omn ≤ 0

Employing a pooling layer just after a convolution layer, the connections in the
pooling layer are defined as follows:
⎛ ⎞ g1
1 Σ ( )g
=⎝ ⎠
p p−1
Umn Oi j (2.2.13)
S×T (i, j)∈Dmn
64 2 Mathematical Background for Deep Learning

p−1
where Oi j is the output of the (i, j)-th unit arranged in a two-dimensional manner
in the (p-1)-th layer, Dmn the pooling window of the (m, n)-th unit in the p-th layer
and (i, j) the index of the unit within the pooling window Dmn of SxT size. The
values of S and T are often set to the same as the filter size in the convolutional layer.
Setting g in Eq. (2.2.13) to 1.0 results in an average pooling as follows:

1 Σ p−1
p
Umn = Oi j (2.2.14)
S×T (i, j)∈Dmn

On the other hand, setting g in Eq. (2.2.13) to ∞ results in a max pooling as


{ }
p−1
p
Umn = max Oi j (2.2.15)
(i, j)∈Dmn

There are no parameters to be tuned in the pooling layer. The computation of the
derivative for the error back propagation is performed in the case of the average
pooling as

ΣΣ
M−1 N −1 p
∂E ∂E ∂Umn
= p ·
p−1
∂ Omn m=0 n=0
∂Umn ∂ Omnp−1
⎛ ⎞
ΣΣ N −1 Σ p−1
M−1
∂E ⎝ 1 ∂ Oi j
= p ·

m=0 n=0
∂U mn S×T (i, j)∈Dmn
p−1
∂ Omn
⎛ ⎞
ΣΣ
M−1 N −1 Σ
∂E ⎝ 1
= p · δmi δn j ⎠ (2.2.16)
m=0 n=0
∂U mn S×T (i, j)∈Dmn

Similarly, the computation of the derivative for the error back propagation is done in
the case of the max pooling as

ΣΣ
M−1 N −1 p
∂E ∂E ∂Umn
= p ·
∂ Omn
p−1
m=0 n=0
∂Umn ∂ Omn
p−1

ΣΣ
M−1 N −1 ( { })
∂E ∂ p−1
= p · max Oi j (2.2.17)
m=0 n=0
∂Umn ∂ Omn
p−1 (i, j)∈Dmn

A normalization layer is employed often in combination with the convolutional layer.


A typical normalization layer is defined as follows:

p−1 p−1
Omn − O mn
p
Umn = √ (2.2.18)
c + σmn
2
2.3 Training Acceleration 65

where

p−1 1 Σ p−1
O mn = Oi, j (2.2.19)
S×T (i. j)∈Dmn

1 Σ ( p−1
)
p−1 2
σmn
2
= Oi, j − O mn (2.2.20)
S×T (i, j)∈Dmn

p−1
Here, Omn is the output of the (m, n)-th unit in the (p-1)-th layer, where the units
are arranged in a two-dimensional manner, Dmn the normalization window of the
(m, n)-th unit in the p-th layer, (i, j) the index of the unit within the normalization
window Dmn of S×T size and c a small constant number to avoid the division by
zero.

2.3 Training Acceleration

In Sect. 2.3, we discuss several methods for accelerating the error back propagation
learning.

2.3.1 Momentum Method

Error back propagation algorithm based on the stochastic gradient decent method
(SGD) is known often very time consuming, and then, various attempts have been
made to accelerate its speed.
The momentum method [10] is one of the standard acceleration methods as it is
relatively easy to implement and is effective in many cases.
Let the connection weight in the t-th update be w (t)
ji , and then, the amount of
(t)
update of the connection weight Δw ji in the standard backpropagation algorithm is
written as follows:
∂ E(w, θ )
Δw (t)
ji = − (2.3.1)
∂w (t)
ji

w (t+1)
ji = w (t) (t)
ji + α · Δw ji (2.3.2)

where α is a learning coefficient.


In the momentum method, the amount of update is corrected by the amount of update
at the previous update. Specifically, when the amount of update in the momentum
66 2 Mathematical Background for Deep Learning

method is written as Δ M w (t)


ji , Eqs. (2.3.1) and (2.3.2) are modified as follows:

∂ E(w, θ )
Δ M w (t)
ji = − + γ Δ M w (t−1) = Δw (t) M (t−1)
ji + γ Δ w ji (2.3.3)
∂w (t)
ji
ji

w (t+1)
ji = w (t) M (t)
ji + α · Δ w ji (2.3.4)

Here, γ is a positive constant. In this method, the update at the current step is corrected
with the past updates multiplied with powers of γ as follows:

w (t+1)
ji = w (t) M (t)
ji + α · Δ w ji
( )
= w (t) (t) M (t−1)
ji + α · Δw ji + γ Δ w ji
( ( ))
= w (t)
ji + α · Δw (t)
ji + γ Δw (t−1)
ji + γ Δ M (t−2)
w ji
( ( ( )))
= w ji + α · Δw ji + γ Δw ji + γ Δw ji + γ Δ M w (t−3)
(t) (t) (t−1) (t−2)
ji
( )
= w (t) (t)
ji + α · Δw ji + γ Δw ji
(t−1)
+ γ 2 Δw (t−2)
ji + γ 3 Δw (t−3)
ji + ···
(2.3.5)

The momentum method has the effect of accelerating the update by increasing the
amount of update when the current update is in the same direction as the previous
update and suppressing the vibration by decreasing the amount of update when the
direction of update is opposite.

2.3.2 AdaGrad and RMSProp

The AdaGrad method is regarded as an improved version of the momentum method


described in Sect. 2.3.1, and the RMSProp method is a further improved version of
the AdaGrad method.
While the momentum method (Sect. 2.3.1) uses a common learning coefficient
across parameters, the AdaGrad method [1] employs learning coefficients indepen-
dently for each parameter, where each parameter is automatically updated based on
the history of previous updates. If the connection weight in the t-th update is w (t)
ji and
(t)
its amount of update is Δw ji , the update of the connection weight in this method is
written as follows:
∂ E(w, θ )
Δw (t)
ji = − (2.3.6)
∂w (t)
ji

w (t+1)
ji = w (t) (t)
ji + α ji (t) · Δw ji (2.3.7)
2.3 Training Acceleration 67

where α ji (t) is the learning coefficient given by

γ
α ji (t) = √ (2.3.8)
∈ + S ji (t)
t (
Σ )2
S ji (t) = Δw (τji ) (2.3.9)
τ =1

Here, γ is a constant and ∈ a small constant to avoid division by zero and to improve
numerical stability.
In this method, the learning coefficients become smaller for parameters that have
been updated largely or more frequently, while they remain relatively large for other
parameters.
On the other hand, the RMSProp method [11] can be regarded as an improved
version of the AdaGrad method. In the RMSProp method, S ji (t) in Eq. (2.3.9) of the
AdaGrad method is modified so that the asymptotic equations given as
⎧ ( )2

⎨ S ji (1) = ρ Δw (1) (t = 1)
ji

⎪ ( )2
(2.3.10)
⎩ S ji (t) = ρ Δw (t) + (1 − ρ)S ji (t − 1) (t ≥ 2)
ji

are employed. Here, ρ is a positive constant smaller than 1.


Let S ji (t) in Eq. (2.3.9) and that in Eq. (2.3.10) be denoted by S jiA (t) and S jiR (t),
respectively, and then, they are expressed as follows:
( )2 ( )2 ( )2 ( )2
S jiA (t) = Δw (t)
ji + Δw (t−1)
ji + Δw (t−2)
ji + Δw (t−3)
ji + ···
(( )2 ( )2 ( )2 )
(t) (t−1) (t−2)
S ji (t) = ρ Δw ji + (1 − ρ) Δw ji
R
+ (1 − ρ) Δw ji
2
+ ···
(2.3.11)

It is seen from Eq. (2.3.11) that the AdaGrad method treats the updates of all
steps equally, while the RMSProp method emphasizes the recent updates. For many
problems, the RMSProp method is known more effective than the AdaGrad method.

2.3.3 Adam

Finally, the Adam method [6], which is probably the most commonly used acceler-
ation method today, is discussed. Here again, a comparison with other methods is
given to make it easier to understand its features.
68 2 Mathematical Background for Deep Learning

This method is known to be another learning acceleration method that changes


independently the learning coefficients for each parameter. The update rule for the
connection weights in the method is written as follows:

∂ E(w, θ )
Δw (t)
ji = − (2.3.12)
∂w (t)
ji

w (t+1)
ji = w (t)
ji + α ji (t) · Δ
Adam (t)
w ji (2.3.13)

where the amount of update ΔAdam w (t)


ji is given as

m (0) = 0 (2.3.14)

m (t) = β1 · m (t−1) + (1 − β1 )Δw (t)


ji (2.3.15)

m (t)
ΔAdam w (t)
ji = (2.3.16)
1 − β1t

where β1 is a constant. From Eqs. (2.3.15) and (2.3.16), ΔAdam w (t)


ji is written as

1 − β1 ( )
ΔAdam w (t)
ji =
(t) (t−1)
t Δw ji + β1 Δw ji + β12 Δw (t−2) + β13 Δw (t−3) + ···
1 − β1 ji ji

(2.3.17)

This equation is similar to Eq. (2.3.5) of the momentum method (Sect. 2.3.1),
suggesting that the Adam method is also an improved version of the momentum
method.
On the other hand, the learning coefficient α ji (t) in Eq. (2.3.13) is given by

v (0) = 0 (2.3.18)

( )2
v (t) = β2 · v (t−1) + (1 − β2 ) Δw (t)
ji (2.3.19)

v (t)
v̂ (t) = (2.3.20)
1 − β2t
γ
α ji (t) = √ (2.3.21)
ε+ v̂ (t)
2.4 Regularization 69

where γ , β1 , and β2 are parameters, and ε is a parameter for avoiding numerical


instability. Note that γ = 0.001, β1 = 0.9, β2 = 0.999, and ε = 1.0 × 10−8 are
suggested in the original paper.
Equation (2.3.19) is similar to Eq. (2.3.10). Thus, the Adam method, employing
the similar update rule as that of the RMSProp method (Sect. 2.3.2), can be considered
to be a method that incorporates the momentum method in the latter method. The
Adam method is considered to be effective in a variety of problems and has become
widely used.

2.4 Regularization

We study here some methods to stabilize learning and prevent overtraining, which
are usually called regularization methods. Section 2.4.1 explains the meaning of
regularization in the context of inverse problems, and Sects. 2.4.2 and 2.4.3 describe
representative regularization methods for the error back propagation algorithm.

2.4.1 What Is Regularization?

A problem in which the cause is input and its result is obtained as output is called a
direct problem, while a problem in which the cause is to be inferred from the result
is called an inverse problem [7]. Let us consider a collision between two cars. The
direct problem is to estimate the deformation and damage using relative positions
of cars, directions of travel of them, speed at the time of collision, and so on as
inputs, while the inverse problem is to estimate the positional relationship and speed
at the time of collision from the deformation and damage after the collision of the
cars. Non-destructive evaluation such as defect identification is also a typical inverse
problem. It is known that inverse problems are much more difficult to be solved than
direct problems.
A typical inverse problem is defined as a problem of estimating the underlying
function yi = f (xi ) from n observed data, (x1 , y1 ), (x2 , y2 ), · · · , (xn , yn ). To solve
this problem, it is usually performed to find the function f opt among various f () that
minimizes the sum of the squared error defined as


n
ES( f ) = ( f (xi ) − yi )2 (2.4.1)
2 i=1

It is well known, however, that the search for the function fopt often fails. Figure 2.5
shows such a case, where f A in Fig. 2.5a reproduces the sample points well, while
f B in Fig. 2.5b has errors at each sample point. As for the sum of squared errors
defined in Eq. (2.4.1), it is clear that the error is bigger in f B than in f A or
70 2 Mathematical Background for Deep Learning

y y

x x
a a

Fig. 2.5 Overfitting

E( f A ) < E( f B ) (2.4.2)

suggesting that we should select f A as f opt .


However, if we compare, for example, the values f A (a) and f B (a) at x = a
in the figures, the point (a, f A (a)) is far from other sample points, while the point
(a, f B (a)) is close to other sample points, and majority may find (a, f B (a)) more
plausible. Thus, it is concluded that f A is overfitting only at the sample points.
To solve the above issue, Tikhonov and Arsenin studied to add a new term to
Eq. (2.4.1) to suppress overfitting to sample points, which he called regularization
[12]. In other words, in Tikhonov’s regularization, instead of minimizing E S ( f ) in
Eq. (2.4.1), E T ( f ) given as

E T ( f ) = E S ( f ) + λE R ( f ) (2.4.3)

is to be minimized, where E R ( f ) is the regularization term and λ the regularization


parameter (λ ≥ 0). λ = 0 results in Eq. (2.4.1) without regularization, and if λ is too
large, minimization of E S ( f ) becomes insufficient. As a sample of the regularization
term, E R ( f ) given as

1
ER( f ) = || D f 2 || (2.4.4)
2

2
is often used, where D is a differential operator and the squared norm. As an
example, we assume
∥ 2∥
1∥ df ∥
ER( f ) = ∥ ∥ (2.4.5)

2 dx ∥

Adding this term in Eq. (2.4.3), the relation as


2.4 Regularization 71

ET ( f A) > ET ( f B ) (2.4.6)

may hold for an appropriate value of λ and f B can be selected as the f opt .
The same may happen in the error back propagation learning in neural networks
to minimize the error defined as


n L
( p L p )2
E= O j − Tj (2.4.7)
2 j=1

When trying to minimize Eq. (2.4.7) with the small number of training patterns, it
is possible to overfit the training patterns, resulting in the reduction of squared error
for training patterns, but the increase of error for verification patterns. This is called
overtraining.
By adding the regularization term E R to Eq. (2.4.7) as


n L
( p L p )2
E= O j − T j + λE R (2.4.8)
2 j=1

it is shown that overtraining is suppressed and the estimation accuracy for patterns
that are not used for training is improved; in other words, the generalization capability
is improved.

2.4.2 Weight Decay

The regularization method takes the sum of the squares of all the connection weights
as the regularization term is one of the most famous methods for neural networks,
where the error function to be minimized is given as follows:

1Σ 1 Σ Σ Σ( l )2
nL
( p L p )2
E T = E S + λE R = O j − Tj + λ w ji (2.4.9)
2 j=1 2 l j i

where wlji is the connection weight between the j-th unit in the (l + 1)-th layer and the
i-th unit in the l-th layer. The amount of update of wab
c
in the error back propagation
learning is written as

∂ ET ∂ ES ∂ ER
ΔT wab
c
=− c =− c −λ c = Δ S wab − λwab
c c
(2.4.10)
∂wab ∂wab ∂wab
72 2 Mathematical Background for Deep Learning

Here, ΔT wab c
is the amount of update of wabc
including the regularization term, and
Δ S wab is that when there is no regularization term. If the learning coefficient is set
c

to be α, the update rule for the connection weight wabc


is written as
( )
wab
c
← wab
c
+ αΔT wab
c
= wab
c
+ α Δ S wab
c
− λwab
c
= (1 − αλ)wab
c
+ αΔ S wab
c

(2.4.11)

where the second term of the right-hand side of Eq. (2.4.11) is the same as the
correction when there is no regularization term. The first term of the right-hand side
of Eq. (2.4.11) is the term due to regularization. Since 0 < 1 − αλ < 1 for most
cases, it usually has the effect of reducing the absolute value of the connection weight
in every training epoch. Therefore, the regularization of Eq. (2.4.9) is called Weight
Decay.

2.4.3 Physics-Informed Network

A regularization method that employs physical information (governing equations,


boundary conditions, etc.) as the regularization term has received much attention in
computational mechanics, called the physics-informed neural network [5, 9].
Let us find a solution of a nonlinear differential equation as

∂u(x, t)
+ N [u] = 0, x ∈ Ω, t ∈ [0, T ] (2.4.12)
∂t

where u(x, t) is the unknown, N [] the nonlinear differential operator and Ω a subset
of R D .
Assuming a data-driven solution of u(x, t) by the neural network, we usually
need to minimize the error function E D as follows:

1 Σ ⎟ ( i i) ⎟
nu
ED = ⎟u x , t − u i ⎟2 (2.4.13)
u u
n u i=1

( )
where x iu , tui , u i is the training data of u(x, t) including initial and boundary training
data and n u the number of training data. Minimization of E D is a method that has
been widely used to obtain an approximate solution using neural networks.
In the physics-informed neural network, a new loss term E P is added to E D ,
where E P is constructed based on the physical laws (partial differential equations or
governing equations) that u(x, t) should satisfy. In case of Eq. (2.4.12), the new loss
function is given as follows:
References 73

⎟ ( ) ⎟2
n f ⎟ ∂u x j , t j ⎡ ( )⏋⎟⎟
λ Σ ⎟ f f
E = E D + λE P = E D + ⎟ + N u x f , t f ⎟⎟
j j
(2.4.14)
n f j=1 ⎟⎟ ∂t ⎟

( )
j j
where x f , t f is the collocation point, n f the number of collocation points and λ
a weight to balance the two loss terms. ( )
To calculate E P , we need derivative values of the output u x iu , tui of a feed-
forward neural network with respect to its input data. These derivatives can be
obtained by automatic differentiation (see Sect. 1.3.8). SciANN [3], a package for
physics-informed neural networks based on a deep learning library with automatic
differentiation, is also available.

References

1. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic
optimization. J. Mach. Learn. Res. 12, 2121–2159 (2011).
2. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016)
3. Haghighat, E., Juanes, R.: SciANN: A Keras/TensorFlow wrapper for scientific computations
and physics-informed deep learning using artificial neural networks. Comput. Methods Appl.
Mech. Eng. 373, 113552 (2021), https://doi.org/10.1016/j.cma.2020.113552
4. Heykin, S.: Neural Networks: A comprehensive Foundation. Prentice Hall (1999)
5. Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L.: Physics-informed
machine learning. Nature Rev. Phys. 3, 422–440 (2021). https://doi.org/10.1038/s42254-021-
00314-5
6. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. in the 3rd International
Conference for Learning Representations (ICLR), San Diego, 2015, arXiv:1412.6980
7. Kubo, S.: Inverse problems related to the mechanics and fracture of solids and structures. JSME
Int. J. 31(2), 157–166 (1988)
8. LeCun, Y.: Generalization and network design strategies. Technical Report CRG-TR-89–4,
Department of Computer Science, University of Toronto (1989)
9. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep
learning framework for solving forward and inverse problems involving nonlinear partial
differential equations. J. Comput. Phys. 378, 686–707 (2019)
10. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating
errors. Nature 323, 533–536 (1986)
11. Tieleman, T., Hinton, G.: Lecture 6.5-rmsprop: Divide the gradient by a running average of its
recent magnitude. COURSERA: Neural networks for machine learning 4(2), 26–31 (2012)
12. Tikhonov, A.N., Arsenin, V.Y.: Solution of Ill-posed Problems. John Wiley & Sons (1977)
Chapter 3
Computational Mechanics with Deep
Learning

Abstract The present chapter overviews recent research trends of deep learning
related to computational mechanics. In Sect. 3.1, we see the growing interest in deep
learning in recent years based on the trend of the number of published papers on
this topic, discussing how deep learning is applied to various fields in computational
mechanics. In Sect. 3.2, we review the research trends from the list of papers on
computational mechanics with deep learning published since 2018.

3.1 Overview

Various papers on feedforward neural networks and deep learning have been reported
in the field of computational mechanics, including material constitutive equations
[2, 3, 7], elemental integration of the finite element method [9], acceleration and
accuracy improvement of the finite element method [6, 11], contact analysis [5, 10],
non-destructive testing [12–14], and structural identification [1, 18]. These studies
in the field of computational mechanics are overviewed in [11, 15, 16].
The book published in 2021 [17] has categorized the above studies as follows:
• Constitutive Models
• Numerical Quadrature
• Identifications of Analysis Parameters
• Solvers and Solution Methods
• Structural Identification
• Structural Optimization.
Here, the category of constitutive models includes modeling of nonlinear and
history-dependent materials, that of numerical quadrature includes optimization of
elemental integration, and that of identification of analysis parameters includes opti-
mization of time increment in dynamic analysis. And the category of solvers and
solution methods includes applications of neural networks to no-reflection bound-
aries and domain decomposition as well as contact search, while the category of
structural identification includes defect identification and structural identification,

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 75


G. Yagawa and A. Oishi, Computational Mechanics with Deep Learning,
Lecture Notes on Numerical Methods in Engineering and Sciences,
https://doi.org/10.1007/978-3-031-11847-0_3
76 3 Computational Mechanics with Deep Learning

Fig. 3.1 Number of articles 2020


on computational mechanics 2017
with neural network and/or
deep learning 2014
2011
2008

Year
2005
2002
1999
1996 Deep Learning
Neural Network
1993
0 10 20 30 40 50
Number of Articles

and that of structural optimization includes applications to various structural opti-


mization problems. As seen from the above, neural networks and deep learning have
been applied to almost all the fields and processes in computational mechanics.
Figure 3.1 shows the trend of the numbers of papers that include the word “neural
network” or “deep learning” in the title, where the total numbers are those of relevant
papers published since 1993 to 2020 in the journals as follows:
• International Journal for Numerical Methods in Engineering (IJNME)
• Computer Methods in Applied Mechanics and Engineering (CMAME)
• Finite Elements in Analysis and Design (FEAD)
• Computational Mechanics (CM)
• Computers & Structures (C&S).

Note that the total numbers of papers with the title “neural network” or “deep
learning” have increased rapidly since 2017, especially that of “ deep learning “ is
to be remarked.
As mentioned above, deep learning and neural networks have been applied to
various fields of computational mechanics, all of which attempt to reproduce the
input–output relationship or causal relationship in some process of computational
mechanics on a neural network.
Let us consider the non-destructive testing (defect identification). The response
at the observation point can be calculated by performing numerical simulation for
solid with some defect using the finite element method, where input and output are
set as follows:
input (x): Location and size of a defect
output ( y): Response at observation points.
In ordinary numerical simulations, the input (x) is the cause and the output ( y) is the
result, where the problem of finding the result ( y) for the cause (x) is called a direct
problem. On the other hand, the problem of finding the cause (x) for the result ( y)
is called an inverse problem. These relations are, respectively, written as follows:
3.1 Overview 77

Direct problem f : x → y (3.1.1)

Inverse problem f −1 : y → x (3.1.2)

While an ordinary numerical solution method for partial differential equations is


a tool for solving direct problems, the deep learning and the neural networks can be
applied to both direct and inverse problems.
In the case of non-destructive testing, the deep learning and neural networks are
usually employed to solve inverse problem, where a neural network is constructed
so as to take the response of the observation point as input and outputs the position
and shape of the defect.
On the other hand, the deep learning and neural networks are used as a tool to
solve a direct problem for the response at the observation point as output, which is
called a surrogate model, used as a substitute to increase analysis speed in ordinary
numerical simulations. In non-destructive testing, a surrogate model based on neural
networks is often used in conjunction with global optimization tools such as genetic
algorithms [8] to speed up analyses of direct problems that are to be performed many
times in an optimization loop.
The procedure of deep learning and neural networks for both direct and inverse
analyses consists of the three phases as follows (Fig. 3.2):

(1) Data Preparation Phase: Set up an input–output relationship needed to find


and construct a rule in the (field of computational) mechanics and so on. Let n-
p p p p
dimensional input data be x1 , x2 , . .(. , xn−1 , xn , p = 1, 2, 3, · · · , and corre-
p p p p)
sponding m-dimensional output data y1 , y2 , . . . , ym−1 , ym , p = 1, 2, 3, · · · ,
with p the number of input/output data pairs, where each component of the data
is a real number (including integers). A large number of input–output data pairs
above are collected for use in constructing a mapping in the following phase.

Fig. 3.2 Three phases in


application of neural Data Preparation Phase
networks to computational
Training patterns are collected.
mechanics

Training Phase
Neural networks are trained
using the collected patterns.

Application Phase
Trained neural networks are used
in target applications.
78 3 Computational Mechanics with Deep Learning

(2) Training Phase: Training of the neural network with deep learning is performed
to acquire mapping relations using the input/output data collected above. When
the trained neural network is to be used as a tool for solving a direct problem,
it is suggested to set its input and output data as follows:
( p p p )
Input data for deep learninng: x p = x1 , x2 , . . . , xn−1 , xnp

( p p p )
Output data for deep learning: y p = y1 , y2 , . . . , ym−1 , ymp

In this case, the deep learning constructs a multidimensional map y p =


f (x p ).as

( )
Direct problem f : R n → R m , i.e., y p = f x p (3.1.3)

On the other hand, when the trained neural network is to be used as a tool for
solving an inverse problem, it is suggested to set its input/output data as follows:

( p p p )
Input data for deep learning: y p = y1 , y2 , . . . , ym−1 , ymp

( p p p )
Output data for deep learning: x p = x1 , x2 , . . . , xn−1 , xnp

In this case, deep learning constructs a multidimensional map x p = f −1 ( y p )


as follows:

( )
Inverse problemk f −1 : R m → R n , i.e., x p = f −1 y p (3.1.4)

(3) Application Phase: By inputting new data to the trained neural network, the
estimated data are output based on the mapping relation constructed in the
above Training Phase. Even when input data are new or independent from those
employed in the Data Preparation Phase, the trained neural network outputs
appropriate data based on its generalization capability.
In the Data Preparation Phase, it is efficient to collect a large amount of input and
output data through computational mechanics simulations. Although experimental
data may work in this case, it is not always suitable as deep learning requires a large
amount of data. In general, it often leads to the degradation of the mapping relation
with insufficient training patterns, resulting in the inaccuracy of the learned mapping
or even the failure in learning the mapping relation.
3.2 Recent Papers on Computational Mechanics with Deep Learning 79

Though both the Data Preparation and the Training Phases need a huge amount
of computation, these are independently performed before the Application Phase.
For example, the computation time for the Training Phase can often be significantly
reduced by using GPUs and other accelerators suitable for deep learning.
It is important to note that more computer time is required for inference as the
number of hidden units and layers is increased to improve the estimation accuracy,
and special arithmetic units for inference are available to solve this issue.
When a neural network already trained for some target problem is applied to
other problem, it is often effective to retrain the network with new input and output
data. This is because the neural network trained once for a target problem can often
be retrained much more quickly for other similar problem, which is called domain
adaptation or transfer learning [4].

3.2 Recent Papers on Computational Mechanics with Deep


Learning

In this section, we survey the latest papers related to deep learning in the field of
computational mechanics to explore the research trends in this area.
Compiled at the end of this chapter is a list of almost 140 papers [19–157]
related to neural networks and deep learning published in five journals (IJNME,
C&S, CMAME, FEAD, and CM: see Sect. 3.1) since 2018.
Table 3.1 summarizes seven papers of generative networks, such as the generative
adversarial networks and the variational autoencoder among the 140 papers above.
The first column of the table shows the year of publication, the second column the
journal title, the third column the neural network structure mainly used (C: convo-
lutional neural network, F: Fully connected feedforward neural network, and R:
Recurrent neural network), and the fourth column the title of the paper.
Table 3.2 summarizes papers on convolutional neural networks. It is interesting
to see that Tables 3.1 and 3.2 show that convolutional neural networks are used in
about 20% of the 140 papers listed above.
Table 3.3 summarizes 15 papers related to physics-informed networks, which
have been increasingly applied to computational mechanics in recent years, mainly
using fully connected feedforward neural networks.
In summary, a variety of new technologies that have emerged in recent years in
the field of deep learning have been rapidly adopted to the field of computational
mechanics, and the scope of deep learning applied to computational mechanics is
expanding further.
80 3 Computational Mechanics with Deep Learning

Table 3.1 Examples of computational mechanics researches with generative networks


Year Journal NN Title
2020 CMAME C Data-driven modelling of nonlinear spatio-temporal fluid flows using a
deep convolutional generative adversarial network [32]
2020 CMAME C An advanced hybrid deep adversarial autoencoder for parameterized
nonlinear fluid flow modelling [33]
2020 CMAME C An end-to-end three-dimensional reconstruction framework of porous
media from a single two-dimensional image based on deep learning [43]
2020 CMAME C Towards blending Physics-Based numerical simulations and seismic
databases using Generative Adversarial Network [50]
2020 CMAME C Deep generative modeling for mechanistic-based learning and design of
metamaterial systems [129]
2019 CM F Conditional deep surrogate models for stochastic, high-dimensional, and
multi-fidelity systems [145]
2019 CM C Solving Bayesian inverse problems from the perspective of deep
generative networks [59]
Note
CMAME: Computational Methods in Applied Mechanics and Engineering
CM: Computational Mechanics

Table 3.2 Examples of computational mechanics researches with convolutional networks


Year Journal NN Title
2021 CMAME C A novel deep learning-based modelling strategy from image of
particles to mechanical properties for granular materials with CNN
and BiLSTM [150]
2021 CMAME C DiscretizationNet: A machine-learning based solver for Navier–Stokes
equations using finite volume discretization [102]
2021 CMAME C Deep-learning-based surrogate flow modeling and geological
parameterization for data assimilation in 3D subsurface flow [115]
2021 CMAME C Nonlocal multicontinua with representative volume elements.
Bridging separable and non-separable scales [35]
2020 CMAME C Geometric deep learning for computational mechanics Part I:
anisotropic hyperelasticity [121]
2020 CMAME C Data-driven reduced order model with temporal convolutional neural
network [139]
2020 CMAME C Designing phononic crystal with anticipated band gap through a deep
learning based data-driven method [78]
2020 CMAME C An intelligent nonlinear meta element for elastoplastic continua: deep
learning using a new Time-distributed Residual U-Net architecture
[72]
2020 CMAME C Attention-based convolutional autoencoders for 3D-Variational data
assimilation [87]
2020 CMAME C Surrogate permeability modelling of low-permeable rocks using
convolutional neural networks [119]
(continued)
3.2 Recent Papers on Computational Mechanics with Deep Learning 81

Table 3.2 (continued)


Year Journal NN Title
2020 CMAME C Multi-level convolutional autoencoder networks for parametric
prediction of spatio-temporal dynamics [141]
2020 CMAME C FEA-Net: A physics-guided data-driven model for efficient
mechanical response prediction [146]
2020 CMAME C Development of an algorithm for reconstruction of droplet history
based on deposition pattern using computational fluid dynamics and
convolutional neural network [148]
2020 CMAME C Machine learning materials physics: Multi-resolution neural networks
learn the free energy and nonlinear elastic response of evolving
microstructures [154]
2020 C&S C Topology optimization of 2D structures with nonlinearities using deep
learning [20]
2020 CM C Microstructural inelastic fingerprints and data-rich predictions of
plasticity and damage in solids [95]
2019 CM C Prediction of aerodynamic flow fields using convolutional neural
networks [28]
2019 IJNME C Deep convolutional neural networks for eigenvalue problems in
mechanics [46]
2019 CMAME C A deep learning-based hybrid approach for the solution of
multiphysics problems in electrosurgery [56]
2019 CMAME C Predicting the effective mechanical property of heterogeneous
materials by image based modeling and deep learning [77]
2019 CMAME C Circumventing the solution of inverse problems in mechanics through
deep learning: Application to elasticity imaging [97]
Note
CMAME: Computational Methods in Applied Mechanics and Engineering
CM: Computational Mechanics
IJNME: International Journal for Numerical Methods in Engineering
C&S: Computers and Structures

Table 3.3 Examples of computational mechanics researches with physics-informed networks


Year Journal NN Title
2021 CMAME F Prediction and identification of physical systems by means of
Physically-guided neural networks with meaningful internal layers [24]
2021 CMAME F A physics-informed deep learning framework for inversion and
surrogate modeling in solid mechanics [53]
2021 CMAME F A physics-informed operator regression framework for extracting
data-driven continuum models [98]
2021 CMAME F Efficient uncertainty quantification for dynamic subsurface flow with
surrogate by Theory-guided Neural Network [132]
(continued)
82 3 Computational Mechanics with Deep Learning

Table 3.3 (continued)


Year Journal NN Title
2021 CMAME F Non-invasive inference of thrombus material properties with
physics-informed neural networks [147]
2021 CMAME F Machine learning for metal additive manufacturing: predicting
temperature and melt pool fluid dynamics using physics-informed
neural networks [157]
2021 C&S R Estimating model inadequacy in ordinary differential equations with
physics-informed neural networks [120]
2020 CMAME F Conservative physics-informed neural networks on discrete domains
for conservation laws: Applications to forward and inverse problems
[62]
2020 CMAME F Physics-informed neural networks for high-speed flows [88]
2020 CMAME F PPINN: Parareal physics-informed neural network for time-dependent
PDEs [90]
2020 CMAME F Physics-informed multi-LSTM networks for metamodeling of
nonlinear structures [152]
2020 CMAME F The neural particle method—An updated Lagrangian physics informed
neural network for computational fluid dynamics [135]
2020 CMAME F Machine learning in cardiovascular flows modeling: Predicting arterial
blood pressure from non-invasive 4D flow MRI data using
physics-informed neural networks [70]
2020 CMAME F Surrogate modeling for fluid flows based on physics-constrained deep
learning without simulation data [112]
2019 CM F General solutions for nonlinear differential equations: a rule-based
self-learning approach using deep reinforcement learning [134]
Note
CMAME: Computational Methods in Applied Mechanics and Engineering
CM: Computational Mechanics
C&S: Computers and Structures

References

1. Facchini, L., Betti, M., Biagini, P.: Neural network based modal identification of structural
systems through output-only measurement. Comput. Struct. 138, 183–194 (2014)
2. Furukawa, T., Yagawa, G.: Implicit constitutive modelling for viscoplasticity using neural
networks. Int. J. Numer. Meth. Eng. 43, 195–219 (1998)
3. Ghaboussi, J., Pecknold, D.A., Zhang, M., Haj-Ali, R.: Autoprogressive training of neural
network constitutive models. Int. J. Numer. Meth. Eng. 42, 105–126 (1998)
4. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016)
5. Hattori, G., Serpa, A.L.: Contact stiffness estimation in ANSYS using simplified models and
artificial neural networks. Finite Elem. Anal. Des. 97, 43–53 (2015)
6. Kim, J.H., Kim, Y.H.: A predictor-corrector method for structural nonlinear analysis. Comput.
Methods Appl. Mech. Eng. 191, 959–974 (2001)
7. Lefik, M., Schrefler, B.A.: Artificial neural network as an incremental non-linear constitutive
model for a finite element code. Comput. Methods Appl. Mech. Eng. 192, 3265–3283 (2003)
References 83

8. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. Springer-


Verlag (1992)
9. Oishi, A., Yagawa, G.: Computational mechanics enhanced by deep learning. Comput.
Methods Appl. Mech. Eng. 327, 327–351 (2017)
10. Oishi, A., Yagawa, G.: A surface-to-surface contact search method enhanced by deep learning.
Comput. Mech. 65, 1125–1147 (2020)
11. Oishi, A., Yagawa, G.: Finite elements using neural networks and a posteriori error. Arch.
Comput. Methods Eng. 28, 3433-3456 (2021). https://doi.org/10.1007/s11831-020-09507-0
12. Oishi, A., Yamada, K., Yoshimura, S., Yagawa, G.: Quantitative nondestructive evaluation
with ultrasonic method using neural networks and computational mechanics. Comput. Mech.
15, 521–533 (1995)
13. Oishi, A., Yamada, K., Yoshimura, S., Yagawa, G., Nagai, S., Matsuda, Y.: Neural network-
based inverse analysis for defect identification with laser ultrasonics. Res. Nondestruct. Eval.
13(2), 79–95 (2001)
14. Stavroulakis, G.E., Antes, H.: Nondestructive elastostatic identification of unilateral cracks
through BEM and neural networks. Comput. Mech. 20, 439–451 (1997)
15. Waszczyszyn, Z., Ziemianski, L.: Neural networks in mechanics of structures and materials
- new results and prospects of applications. Comput. Struct. 79, 2261–2276 (2001)
16. Yagawa, G., Okuda, H.: Neural networks in computational mechanics. Arch. Comput.
Methods Eng. 3(4), 435–512 (1996)
17. Yagawa, G., Oishi, A.: Computational Mechanics with Neural Networks. Springer (2021)
18. Yoshimura, S., Matsuda, A., Yagawa, G.: New regularization by transformation for neural
network based inverse analyses and its application to structure identification. Int. J. Numer.
Meth. Eng. 39, 3953-396 (1996)
19. Abbas, T., Kavrakov, I., Morgenthal, G., Lahmer, T.: Prediction of aeroelastic response of
bridge decks using artificial neural networks. Comput. Struct. 231, 106198 (2020). https://
doi.org/10.1016/j.compstruc.2020.106198
20. Abueidda, D.W., Koric, S., Sobh, N.A.: Topology optimization of 2D structures with nonlin-
earities using deep learning. Comput. Struct. 237, 106283 (2020). https://doi.org/10.1016/j.
compstruc.2020.106283
21. Angeli, A., Desmet, W., Naets, F.: Deep learning for model order reduction of multibody
systems to minimal coordinates. Comput. Methods Appl. Mech. Eng. 373, 113517 (2021).
https://doi.org/10.1016/j.cma.2020.113517
22. Asaadi, E., Heyns, P.S., Haftka, R.T., Tootkaboni, M.: On the value of test data for reducing
uncertainty in material models: Computational framework and application to spherical inden-
tation. Comput. Methods Appl. Mech. Eng. 346, 513–529 (2019). https://doi.org/10.1016/j.
cma.2018.11.021
23. Avery, P., Huang, D.Z., He, W., Ehlers, J., Derkevorkian, A., Farhat, C.: A computationally
tractable framework for nonlinear dynamic multiscale modeling of membrane woven fabrics.
Int. J. Numer. Methods Eng. 122, 2598–2625 (2021). https://doi.org/10.1002/nme.6634
24. Ayensa-Jiménez, J., Doweidar, M.H., Sanz-Herrera, J.A., Doblaré, M.: Prediction and identifi-
cation of physical systems by means of Physically-Guided Neural Networks with meaningful
internal layers. Comput. Methods Appl. Mech. Eng. 381, 113816 (2021). https://doi.org/10.
1016/j.cma.2021.113816
25. Bacigalupo, A., Gnecco, G., Lepidi, M., Gambarotta, L.: Computational design of innovative
mechanical metafilters via adaptive surrogate-based optimization. Comput. Methods Appl.
Mech. Eng. 375, 113623 (2021). https://doi.org/10.1016/j.cma.2020.113623
26. Baiges, J., Codina, R., Castañar, I., Castillo, E.: A finite element reduced-order model based
on adaptive mesh refinement and artificial neural networks. Int. J. Numer. Methods Eng. 121,
588–601 (2020). https://doi.org/10.1002/nme.6235
27. Balokas, G., Kriegesmann, B., Czichon, S., Rolfes, R.: A variable-fidelity hybrid surro-
gate approach for quantifying uncertainties in the nonlinear response of braided composites.
Part II
Case Study
Chapter 4
Numerical Quadrature with Deep Learning

Abstract It is well known that the element stiffness matrix of a distorted element
calculated using numerical quadrature has a relatively large error. In this chapter,
a method to improve the efficiency of the element integration without degrading
accuracy is studied by employing deep learning.

4.1 Summary of Numerical Quadrature

Numerical quadrature is often used for the element integration of the finite element method [1, 2], where the integral value is approximated by the sum of the products of the values of the integrand and the corresponding weights at several points (coordinates) called integration points. In the Gauss–Legendre quadrature, which is one of the most popular numerical quadrature rules in the finite element method, the integral of a function f(x) over the range [−1, 1] is approximated as follows:

\int_{-1}^{1} f(x)\,dx \approx \sum_{i=1}^{n} f(x_i) H_i \qquad (4.1.1)

where x_i is the coordinate of the i-th integration point, H_i the weight at the integration point x_i, and n the number of integration points.
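Although this rule is usually buried inside a finite element code, its structure is simple. The following minimal Python sketch (the function and variable names are ours, not from any particular library) evaluates Eq. (4.1.1) for a given set of points and weights; with the standard two-point values listed later in Table 4.1 it integrates x^2 exactly.

import math

def quadrature(f, points, weights):
    # Eq. (4.1.1): weighted sum of integrand values at the integration points
    return sum(f(x) * h for x, h in zip(points, weights))

# standard two-point Gauss-Legendre rule: x = +/- sqrt(1/3), H = 1.0
pts = [-math.sqrt(1.0 / 3.0), math.sqrt(1.0 / 3.0)]
wts = [1.0, 1.0]
print(quadrature(lambda x: x ** 2, pts, wts))  # 0.6666..., i.e., the exact 2/3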

4.1.1 Legendre Polynomials

The coordinates x_i and weights H_i of the integration points of the Gauss–Legendre quadrature above are obtained by using the Legendre and Lagrange polynomials. First, the n-th order Legendre polynomial P_n is known to be given by


P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n}\left(x^2 - 1\right)^n \qquad (4.1.2)

For example, the first-, the second-, and the third-order Legendre polynomials are, respectively, written as

P_1(x) = x \qquad (4.1.3)

P_2(x) = \frac{1}{2}\left(3x^2 - 1\right) \qquad (4.1.4)

P_3(x) = \frac{1}{2}\left(5x^3 - 3x\right) \qquad (4.1.5)
2
The Legendre polynomial P_n has the special property that the integral over [−1, 1] of the product of P_n and any polynomial Q_{n-1}(x) of the (n−1)-th order is zero, which is written as

\int_{-1}^{1} Q_{n-1}(x) P_n(x)\,dx = 0 \qquad (4.1.6)

For example, the integral of the product of the cubic Legendre polynomial P_3(x) and an arbitrary quadratic polynomial ax^2 + bx + c is zero, as shown in

\int_{-1}^{1} \left(ax^2 + bx + c\right) P_3(x)\,dx = \int_{-1}^{1} \left(ax^2 + bx + c\right) \frac{1}{2}\left(5x^3 - 3x\right) dx = \int_{0}^{1} \left(5bx^4 - 3bx^2\right) dx = \left[\, bx^5 - bx^3 \,\right]_0^1 = 0 \qquad (4.1.7)

Note that the equation

P_n(x) = 0 \qquad (4.1.8)

has n distinct real solutions \{x_1, x_2, \cdots, x_{n-1}, x_n\}\ (x_i < x_{i+1}) in the range (−1, 1).
For example, the solutions of Eq. (4.1.8) for the cases of n = 2 and n = 3 are, respectively, written as

x_1 = -\sqrt{\frac{1}{3}},\quad x_2 = \sqrt{\frac{1}{3}} \qquad (n = 2) \qquad (4.1.9)

x_1 = -\sqrt{\frac{3}{5}},\quad x_2 = 0,\quad x_3 = \sqrt{\frac{3}{5}} \qquad (n = 3) \qquad (4.1.10)
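These roots are easy to check numerically. The short sketch below, which assumes NumPy is available, reproduces Eqs. (4.1.9) and (4.1.10) with the Legendre tools of numpy.polynomial.

import numpy as np
from numpy.polynomial import legendre

for n in (2, 3):
    # the coefficient vector [0, ..., 0, 1] of length n + 1 selects P_n
    roots = np.sort(legendre.legroots([0.0] * n + [1.0]))
    print(n, roots)
# n = 2: [-0.57735027  0.57735027]             i.e., +/- sqrt(1/3)
# n = 3: [-0.77459667  0.          0.77459667]  i.e., +/- sqrt(3/5) and 0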

4.1.2 Lagrange Polynomials


It is known that, for arbitrary n distinct values \{x_1, x_2, \cdots, x_{n-1}, x_n\}\ (x_i < x_{i+1}), we can write the Lagrange polynomials of the (n−1)-th order L_i^{n-1}(x) as follows:

L_i^{n-1}(x) = \frac{(x - x_1)\cdots(x - x_{i-1})(x - x_{i+1})\cdots(x - x_n)}{(x_i - x_1)\cdots(x_i - x_{i-1})(x_i - x_{i+1})\cdots(x_i - x_n)} \qquad (4.1.11)

The Lagrange polynomial of the (n−1)-th order, L_i^{n-1}(x), has the following property:

L_i^{n-1}(x_j) = \begin{cases} 1 & (i = j) \\ 0 & (i \neq j) \end{cases} = \delta_{ij} \quad \text{(Kronecker delta)} \qquad (4.1.12)

Based on the property shown above, we can easily describe a polynomial function y = f(x) as follows:

y = f(x) = \sum_{i=1}^{n} y_i \cdot L_i^{n-1}(x) \qquad (4.1.13)

which interpolates n different points \{(x_1, y_1), (x_2, y_2), \cdots, (x_{n-1}, y_{n-1}), (x_n, y_n)\}\ (x_i < x_{i+1}) using the corresponding Lagrange polynomials defined with \{x_1, x_2, \cdots, x_{n-1}, x_n\}. Here, L_i^{n-1}(x) is the Lagrange polynomial of the (n−1)-th order constructed from \{x_1, x_2, \cdots, x_{n-1}, x_n\} (see Eq. (4.1.11)). It is easily found that the polynomial function Eq. (4.1.13) interpolates the n points \{(x_1, y_1), (x_2, y_2), \cdots, (x_{n-1}, y_{n-1}), (x_n, y_n)\} as follows:

y = f(x_j) = \sum_{i=1}^{n} y_i \cdot L_i^{n-1}(x_j) = \sum_{i=1}^{n} y_i \cdot \delta_{ij} = y_j \qquad (4.1.14)
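A small sketch may make Eqs. (4.1.11)–(4.1.14) concrete; the helper below is our own illustration (not from the text) and assumes NumPy.

import numpy as np

def lagrange_basis(nodes, i, x):
    # L_i^{n-1}(x) built from the given nodes, Eq. (4.1.11)
    others = np.delete(nodes, i)
    return float(np.prod((x - others) / (nodes[i] - others)))

nodes = np.array([-0.8, -0.1, 0.4, 0.9])
# Kronecker-delta property, Eq. (4.1.12)
print([[round(lagrange_basis(nodes, i, xj), 12) for xj in nodes]
       for i in range(len(nodes))])

y = np.cos(nodes)                      # sample any smooth function at the nodes
f = lambda x: sum(y[i] * lagrange_basis(nodes, i, x) for i in range(len(nodes)))
print(f(nodes[2]), y[2])               # identical values, as in Eq. (4.1.14)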

4.1.3 Formulation of Gauss–Legendre Quadrature

Here, we study the formulation of the Gauss–Legendre quadrature using the Legendre and Lagrange polynomials, both described above.
As discussed above, for the n-th order Legendre polynomial P_n(x), n distinct real values \{x_1, x_2, \cdots, x_{n-1}, x_n\}\ (x_i < x_{i+1}) are obtained as the solutions of P_n(x) = 0, Eq. (4.1.8).
Using these n values, we can construct n Lagrange polynomials of the (n−1)-th order, L_i^{n-1}(x), based on Eq. (4.1.11).
For an arbitrary integrand f(x), we let y_i = f(x_i)\ (i = 1, \cdots, n) be the values of the function at the above n solutions \{x_1, x_2, \cdots, x_{n-1}, x_n\} and define a (2n−1)-th order polynomial Q_{2n-1}(x) as follows:

Q_{2n-1}(x) = R_{n-1}(x) \cdot P_n(x) + \sum_{i=1}^{n} y_i \cdot L_i^{n-1}(x) \qquad (4.1.15)

where R_{n-1}(x) is an arbitrary polynomial of the (n−1)-th order, P_n(x) the n-th order Legendre polynomial, and L_i^{n-1}(x) the (n−1)-th order Lagrange polynomial defined with the n solutions \{x_1, x_2, \cdots, x_{n-1}, x_n\} of P_n(x) = 0.
Then, we can see that Q_{2n-1}(x) is equal to f(x) at the solutions \{x_1, x_2, \cdots, x_{n-1}, x_n\} of the n-th order Legendre polynomial P_n as follows:

Q_{2n-1}(x_j) = R_{n-1}(x_j) \cdot P_n(x_j) + \sum_{i=1}^{n} y_i \cdot L_i^{n-1}(x_j) = R_{n-1}(x_j) \cdot 0 + \sum_{i=1}^{n} y_i \cdot \delta_{ij} = y_j = f(x_j) \qquad (4.1.16)

This means that the (2n−1)-th order polynomial Q_{2n-1}(x) can be regarded as an approximating polynomial of f(x), and thus the integral of f(x) can be approximated by the integral of Q_{2n-1}(x) as

\int_{-1}^{1} f(x)\,dx \approx \int_{-1}^{1} Q_{2n-1}(x)\,dx = \int_{-1}^{1} R_{n-1}(x) P_n(x)\,dx + \sum_{i=1}^{n} y_i \int_{-1}^{1} L_i^{n-1}(x)\,dx \qquad (4.1.17)

Here, the first term on the right-hand side of this equation is zero due to the property of Eq. (4.1.6), and we obtain

\int_{-1}^{1} f(x)\,dx \approx \sum_{i=1}^{n} y_i \int_{-1}^{1} L_i^{n-1}(x)\,dx \qquad (4.1.18)

The above equation shows that the definite integral on the left-hand side is approximated by the sum on the right-hand side. If H_i is defined as

H_i = \int_{-1}^{1} L_i^{n-1}(x)\,dx \qquad (4.1.19)

we finally obtain

\int_{-1}^{1} f(x)\,dx \approx \sum_{i=1}^{n} y_i H_i = \sum_{i=1}^{n} f(x_i) H_i \qquad (4.1.20)

This is equivalent to Eq. (4.1.1).


The values of H_i for n = 2 and n = 3 are, respectively, calculated as

H_1 = 1.0,\quad H_2 = 1.0 \qquad (n = 2) \qquad (4.1.21)

H_1 = \frac{5}{9},\quad H_2 = \frac{8}{9},\quad H_3 = \frac{5}{9} \qquad (n = 3) \qquad (4.1.22)
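These weights can be reproduced directly from the definition (4.1.19). The sketch below (assuming NumPy, with function names of our own choosing) integrates the Lagrange polynomials built on the roots of P_n.

import numpy as np
from numpy.polynomial import legendre, polynomial as P

def gauss_legendre_from_definition(n):
    xi = np.sort(legendre.legroots([0.0] * n + [1.0]))  # roots of P_n, Eq. (4.1.8)
    weights = []
    for i in range(n):
        others = np.delete(xi, i)
        li = P.polyfromroots(others) / np.prod(xi[i] - others)  # L_i^{n-1}, Eq. (4.1.11)
        anti = P.polyint(li)                                    # antiderivative of L_i^{n-1}
        weights.append(P.polyval(1.0, anti) - P.polyval(-1.0, anti))  # Eq. (4.1.19)
    return xi, np.array(weights)

print(gauss_legendre_from_definition(3))
# points +/- sqrt(3/5) and 0, weights 5/9, 8/9, 5/9, as in Eqs. (4.1.10) and (4.1.22)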

Equation (4.1.20) is called the Gauss–Legendre quadrature, where the solutions \{x_1, x_2, \cdots, x_{n-1}, x_n\} of Eq. (4.1.8) are used as the integration points, and the sum of the products of the values of the integrand y_i and the weights given by Eq. (4.1.19) at the integration points gives the approximate value of the integral of the function f(x).
For example, let us consider the following integral of the exponential function.

\int_{-1}^{1} e^x\,dx = \left[\, e^x \,\right]_{-1}^{1} = e - \frac{1}{e} = 2.3504023872876028\cdots \qquad (4.1.23)

By using the Gauss–Legendre quadrature, Eq. (4.1.20), this integral can be approximated as

\int_{-1}^{1} e^x\,dx \cong \sum_{i=1}^{n} e^{x_i} \cdot H_i = e^{x_1} \cdot H_1 + e^{x_2} \cdot H_2 + \cdots + e^{x_n} \cdot H_n \qquad (4.1.24)

Fig. 4.1 Accuracy of Gauss–Legendre quadrature (error versus number of quadrature points)

where n is the number of integration points. The error of the approximate value
obtained by the Gauss–Legendre quadrature is shown in Fig. 4.1. The horizontal axis
shows the number of integration points and the vertical axis the absolute value of
the difference between the true and the approximate values obtained by the Gauss–
Legendre quadrature. The calculations are performed using double-precision real
numbers. It is shown that the accuracy is very high when using more than eight
integration points. As can be seen from this example, the accuracy is improved by
increasing the number of integration points. Table 4.1 shows the coordinates and
weights of the integration points up to 10 integration points.
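The convergence behaviour of Fig. 4.1 can be reproduced with a few lines; the sketch below relies on NumPy's built-in Gauss–Legendre rule (numpy.polynomial.legendre.leggauss), which returns the points and weights tabulated in Table 4.1.

import numpy as np

exact = np.exp(1.0) - np.exp(-1.0)              # Eq. (4.1.23)
for n in range(1, 9):
    x, h = np.polynomial.legendre.leggauss(n)   # points and weights (cf. Table 4.1)
    approx = np.sum(np.exp(x) * h)              # Eq. (4.1.24)
    print(n, abs(approx - exact))
# the error falls rapidly and reaches round-off level at about eight points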
The Gauss–Legendre quadrature in two and three dimensions is defined as a natural extension of the one-dimensional rule, as follows.
Two-dimensional Gauss–Legendre quadrature:

\int_{-1}^{1}\int_{-1}^{1} f(x, y)\,dx\,dy \approx \sum_{i=1}^{n}\sum_{j=1}^{m} f(x_i, y_j) \cdot H_{ij} \qquad (4.1.25)

Three-dimensional Gauss–Legendre quadrature:

\int_{-1}^{1}\int_{-1}^{1}\int_{-1}^{1} f(x, y, z)\,dx\,dy\,dz \approx \sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{l} f(x_i, y_j, z_k) \cdot H_{ijk} \qquad (4.1.26)

Here, n, m, and l are the numbers of integration points along each axis.
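These multi-dimensional rules are simple tensor products of the one-dimensional rule. The sketch below assumes, as in Eq. (4.2.16) later in this chapter, that the multi-dimensional weight is the product of the one-dimensional weights.

import numpy as np

def gauss_2d(f, n, m):
    x, hx = np.polynomial.legendre.leggauss(n)
    y, hy = np.polynomial.legendre.leggauss(m)
    return sum(f(x[i], y[j]) * hx[i] * hy[j]
               for i in range(n) for j in range(m))                    # Eq. (4.1.25)

def gauss_3d(f, n, m, l):
    x, hx = np.polynomial.legendre.leggauss(n)
    y, hy = np.polynomial.legendre.leggauss(m)
    z, hz = np.polynomial.legendre.leggauss(l)
    return sum(f(x[i], y[j], z[k]) * hx[i] * hy[j] * hz[k]
               for i in range(n) for j in range(m) for k in range(l))  # Eq. (4.1.26)

print(gauss_2d(lambda x, y: x**2 * y**2, 2, 2))                # 4/9 (exact)
print(gauss_3d(lambda x, y, z: x**2 + y**2 + z**2, 2, 2, 2))   # 8 (exact)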



Table 4.1 Gauss–Legendre quadrature parameters


Total number of quadrature points Points Weights
2 ±0.5773502691896257 1.0000000000000000
3 0.0000000000000000 0.8888888888888888
±0.7745966692414834 0.5555555555555554
4 ±0.3399810435848563 0.6521451548625462
±0.8611363115940526 0.3478548451374537
5 0.0000000000000000 0.5688888888888889
±0.5384693101056831 0.4786286704993665
±0.9061798459386640 0.2369268850561890
6 ±0.2386191860831969 0.4679139345726911
±0.6612093864662646 0.3607615730481386
±0.9324695142031521 0.1713244923791706
7 0.0000000000000000 0.4179591836734694
±0.4058451513773972 0.3818300505051190
±0.7415311855993945 0.2797053914892767
±0.9491079123427585 0.1294849661688696
8 ±0.1834346424956498 0.3626837833783620
±0.5255324099163290 0.3137066458778874
±0.7966664774136267 0.2223810344533745
±0.9602898564975363 0.1012285362903763
9 0.0000000000000000 0.3302393550012598
±0.3242534234038089 0.3123470770400026
±0.6133714327005905 0.2606106964029354
±0.8360311073266359 0.1806481606948574
±0.9681602395076261 0.0812743883615745
10 ±0.1488743389816312 0.2955242247147529
±0.4333953941292472 0.2692667193099963
±0.6794095682990244 0.2190863625159821
±0.8650633666889845 0.1494513491505805
±0.9739065285171716 0.0666713443086882

4.1.4 Improvement of Gauss–Legendre Quadrature

As described so far, the Gauss–Legendre quadrature with n integration points is equivalent to approximating the integrand function by a (2n−1)-th degree polynomial and obtaining the integral value of the polynomial as that of the function. It is more accurate than the Newton–Cotes quadrature [1], which approximates the integrand function by a (n−1)-th degree polynomial using n integration points, and is regarded as an excellent general-purpose numerical integration method.

However, it is seen that there is room for improvement in some cases. Let us
consider the following integral.

\int_{-1}^{1} x^6\,dx = \left[ \frac{x^7}{7} \right]_{-1}^{1} = \frac{1}{7} - \frac{-1}{7} = \frac{2}{7} = 0.2857142857\cdots \qquad (4.1.27)

If we calculate this integral using the Gauss–Legendre quadrature with two integration points, the error is large, as shown in

\int_{-1}^{1} x^6\,dx \approx 1.0 \times \left(-\sqrt{\frac{1}{3}}\right)^6 + 1.0 \times \left(\sqrt{\frac{1}{3}}\right)^6 = \frac{2}{27} = 0.074074074\cdots \qquad (4.1.28)

Calculating the integral using the Gauss–Legendre quadrature with three integration points, we still have some error, as

\int_{-1}^{1} x^6\,dx \approx \frac{5}{9} \times \left(-\sqrt{\frac{3}{5}}\right)^6 + \frac{8}{9} \times 0^6 + \frac{5}{9} \times \left(\sqrt{\frac{3}{5}}\right)^6 = \frac{6}{25} = 0.24 \qquad (4.1.29)

Note that the correct value of the integral is obtained if the weights are changed from 1.0 to 27/7 in Eq. (4.1.28), and from 5/9 to 125/189 (= 25/21 × 5/9) in Eq. (4.1.29), respectively.
Instead, changing the coordinates of the integration points can reduce the error. For example, by changing the coordinates from \pm\sqrt{1/3} to \pm\sqrt[6]{1/7} in the case of two integration points above, we get the correct value as

\int_{-1}^{1} x^6\,dx = 1.0 \times \left(-\sqrt[6]{\frac{1}{7}}\right)^6 + 1.0 \times \left(\sqrt[6]{\frac{1}{7}}\right)^6 = \frac{2}{7} = 0.2857142857\cdots \qquad (4.1.30)
The standard coordinates and weights of the Gauss–Legendre quadrature are determined only by the number of integration points, and the same coordinates and weights are used for any integrand. On the other hand, the best coordinates and weights for improving the accuracy of the integral naturally differ depending on the integrand.
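The numbers in Eqs. (4.1.27)–(4.1.30) are easily verified; the following few lines of Python are a direct check (plain arithmetic, no external library).

f = lambda x: x ** 6

std2 = 1.0 * f(-(1/3) ** 0.5) + 1.0 * f((1/3) ** 0.5)                       # Eq. (4.1.28)
std3 = (5/9) * f(-(3/5) ** 0.5) + (8/9) * f(0.0) + (5/9) * f((3/5) ** 0.5)  # Eq. (4.1.29)
mod2 = 1.0 * f(-(1/7) ** (1/6)) + 1.0 * f((1/7) ** (1/6))                   # Eq. (4.1.30)
print(std2, std3, mod2, 2/7)
# 0.0740..., 0.24, 0.2857142857..., 0.2857142857...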

4.2 Summary of Stiffness Matrix for Finite Element Method

The isoparametric element is popular in the finite element community; in this formulation, the displacements {u} and the coordinates {x} at any point in an element are approximated using the displacements and coordinates of the nodes and the shape functions N_i(ξ, η, ζ) [1, 2]. For the three-dimensional case, {u} and {x} in an element are, respectively, expressed as
\{u\} = \begin{Bmatrix} u \\ v \\ w \end{Bmatrix} = \begin{Bmatrix} u(\xi, \eta, \zeta) \\ v(\xi, \eta, \zeta) \\ w(\xi, \eta, \zeta) \end{Bmatrix} = \sum_{i=1}^{n} N_i(\xi, \eta, \zeta) \cdot \begin{Bmatrix} U_i \\ V_i \\ W_i \end{Bmatrix} \qquad (4.2.1)

\{x\} = \begin{Bmatrix} x \\ y \\ z \end{Bmatrix} = \begin{Bmatrix} x(\xi, \eta, \zeta) \\ y(\xi, \eta, \zeta) \\ z(\xi, \eta, \zeta) \end{Bmatrix} = \sum_{i=1}^{n} N_i(\xi, \eta, \zeta) \cdot \begin{Bmatrix} X_i \\ Y_i \\ Z_i \end{Bmatrix} \qquad (4.2.2)

where n is the total number of nodes of the element, and (U_i, V_i, W_i)^T and (X_i, Y_i, Z_i)^T are the displacements and the coordinates of the i-th node of the element, respectively.
Next, we define a vector \{U\} of all the nodal displacements in an element and a matrix [N] of shape functions as
\{U\} = \begin{Bmatrix} U_1 \\ V_1 \\ W_1 \\ \vdots \\ U_n \\ V_n \\ W_n \end{Bmatrix} \qquad (4.2.3)

[N] = \begin{bmatrix} N_1 & 0 & 0 & & N_n & 0 & 0 \\ 0 & N_1 & 0 & \cdots & 0 & N_n & 0 \\ 0 & 0 & N_1 & & 0 & 0 & N_n \end{bmatrix} \qquad (4.2.4)

Then, the displacement {u} at a given point in the element is expressed by the
nodal displacement vector {U } as follows:
\{u\} = \begin{Bmatrix} \sum N_i U_i \\ \sum N_i V_i \\ \sum N_i W_i \end{Bmatrix} = \begin{bmatrix} N_1 & 0 & 0 & & N_n & 0 & 0 \\ 0 & N_1 & 0 & \cdots & 0 & N_n & 0 \\ 0 & 0 & N_1 & & 0 & 0 & N_n \end{bmatrix} \begin{Bmatrix} U_1 \\ V_1 \\ W_1 \\ \vdots \\ U_n \\ V_n \\ W_n \end{Bmatrix} = [N]\{U\} \qquad (4.2.5)

The strain {ε} and stress {σ } at a given point in an element are also written using
the nodal displacement vector {U } as
\{\varepsilon\} = \begin{Bmatrix} \varepsilon_x \\ \varepsilon_y \\ \varepsilon_z \\ \gamma_{xy} \\ \gamma_{yz} \\ \gamma_{zx} \end{Bmatrix} = \begin{Bmatrix} \partial u/\partial x \\ \partial v/\partial y \\ \partial w/\partial z \\ \partial u/\partial y + \partial v/\partial x \\ \partial v/\partial z + \partial w/\partial y \\ \partial u/\partial z + \partial w/\partial x \end{Bmatrix} = \begin{bmatrix} \partial/\partial x & 0 & 0 \\ 0 & \partial/\partial y & 0 \\ 0 & 0 & \partial/\partial z \\ \partial/\partial y & \partial/\partial x & 0 \\ 0 & \partial/\partial z & \partial/\partial y \\ \partial/\partial z & 0 & \partial/\partial x \end{bmatrix} \begin{Bmatrix} u \\ v \\ w \end{Bmatrix} = [L]\{u\} = [L][N]\{U\} \qquad (4.2.6)

\{\sigma\} = \begin{Bmatrix} \sigma_x \\ \sigma_y \\ \sigma_z \\ \tau_{xy} \\ \tau_{yz} \\ \tau_{zx} \end{Bmatrix} = [D] \begin{Bmatrix} \varepsilon_x \\ \varepsilon_y \\ \varepsilon_z \\ \gamma_{xy} \\ \gamma_{yz} \\ \gamma_{zx} \end{Bmatrix} = [D]\{\varepsilon\} = [D][L][N]\{U\} \qquad (4.2.7)

where [D] is the stress–strain matrix. The product [L][N] is often denoted as [B], referred to as the strain–displacement matrix. For a three-dimensional isotropic elastic body, [D] is given as

[D] = \frac{E(1-\nu)}{(1+\nu)(1-2\nu)} \begin{bmatrix} 1 & \frac{\nu}{1-\nu} & \frac{\nu}{1-\nu} & 0 & 0 & 0 \\ \frac{\nu}{1-\nu} & 1 & \frac{\nu}{1-\nu} & 0 & 0 & 0 \\ \frac{\nu}{1-\nu} & \frac{\nu}{1-\nu} & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1-2\nu}{2(1-\nu)} & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1-2\nu}{2(1-\nu)} & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1-2\nu}{2(1-\nu)} \end{bmatrix} \qquad (4.2.8)

where E and ν are, respectively, Young’s modulus and Poisson’s ratio.


Spatial discretization of the equation of equilibrium for the static problem of a
structure yields the matrix equation as follows:
[K]\{U^G\} = \{F\} \qquad (4.2.9)

where \{U^G\} is the vector of displacements of all the nodes in the structure, [K] the global stiffness matrix, and \{F\} the load vector. The global stiffness matrix is constructed by assembling all the element stiffness matrices of the whole structure as

[K] = \sum_{e=1}^{n_e} \left[ k^e \right] \qquad (4.2.10)

Since the size of the global stiffness matrix [K] is different from that of an element stiffness matrix [k^e], the summation in Eq. (4.2.10) is not a simple matrix sum: each component of an element stiffness matrix is added to the corresponding position of the global stiffness matrix.
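A minimal sketch of this assembly (assuming NumPy; the dense global matrix and the connectivity format are simplifications for illustration only) is shown below: each entry of an element matrix is added at the position given by the element's global degree-of-freedom numbers.

import numpy as np

def assemble(n_dof, element_matrices, element_dofs):
    # Eq. (4.2.10): scatter-add each element matrix into the global matrix
    K = np.zeros((n_dof, n_dof))
    for ke, dofs in zip(element_matrices, element_dofs):
        for a, A in enumerate(dofs):
            for b, B in enumerate(dofs):
                K[A, B] += ke[a, b]
    return K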
The element stiffness matrix is given using the stress–strain matrix [D] and the strain–displacement matrix [B] as

\left[ k^e \right] = \int_{v^e} [B]^T [D] [B]\,dv \qquad (4.2.11)

where v^e means that the entire element is taken as the integral domain.
The element integration is performed by transforming the coordinates from the real space (xyz space) to the parameter space (ξηζ space) and then using the Gauss–Legendre quadrature. Figure 4.2 shows the coordinate transformation in the two-dimensional case, while that in the three-dimensional case results in the integration over the [−1, 1] × [−1, 1] × [−1, 1] region in the ξηζ space as follows:

\left[ k^e \right] = \iiint_{v^e} [B]^T [D] [B]\,dx\,dy\,dz = \int_{-1}^{1}\int_{-1}^{1}\int_{-1}^{1} [B]^T [D] [B] \cdot |J| \cdot d\xi\,d\eta\,d\zeta \qquad (4.2.12)

where the Jacobian matrix [J] in the coordinate transformation is calculated by summing the products of the derivatives of the basis functions and the nodal coordinates from Eq. (4.2.2) as

[J] = \begin{bmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial y}{\partial \xi} & \frac{\partial z}{\partial \xi} \\ \frac{\partial x}{\partial \eta} & \frac{\partial y}{\partial \eta} & \frac{\partial z}{\partial \eta} \\ \frac{\partial x}{\partial \zeta} & \frac{\partial y}{\partial \zeta} & \frac{\partial z}{\partial \zeta} \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} \frac{\partial N_i}{\partial \xi} X_i & \sum_{i=1}^{n} \frac{\partial N_i}{\partial \xi} Y_i & \sum_{i=1}^{n} \frac{\partial N_i}{\partial \xi} Z_i \\ \sum_{i=1}^{n} \frac{\partial N_i}{\partial \eta} X_i & \sum_{i=1}^{n} \frac{\partial N_i}{\partial \eta} Y_i & \sum_{i=1}^{n} \frac{\partial N_i}{\partial \eta} Z_i \\ \sum_{i=1}^{n} \frac{\partial N_i}{\partial \zeta} X_i & \sum_{i=1}^{n} \frac{\partial N_i}{\partial \zeta} Y_i & \sum_{i=1}^{n} \frac{\partial N_i}{\partial \zeta} Z_i \end{bmatrix} \qquad (4.2.13)

Fig. 4.2 Coordinate transformation for numerical quadrature in the two-dimensional space

Inverting the Jacobian matrix above, each component of the strain–displacement matrix [B], i.e., a derivative of a basis function with respect to x, y, or z, is calculated as follows:

\begin{Bmatrix} \partial N_i/\partial x \\ \partial N_i/\partial y \\ \partial N_i/\partial z \end{Bmatrix} = [J]^{-1} \begin{Bmatrix} \partial N_i/\partial \xi \\ \partial N_i/\partial \eta \\ \partial N_i/\partial \zeta \end{Bmatrix} \qquad (4.2.14)

Thus, the element stiffness matrix calculated using the Gauss–Legendre quadrature is written as

\left[ k^e \right] \approx \sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{l} \left( [B]^T [D] [B] \cdot |J| \right)\Big|_{\xi=\xi_i,\ \eta=\eta_j,\ \zeta=\zeta_k} \cdot H_{i,j,k} \qquad (4.2.15)

where n, m, and l are, respectively, the numbers of integration points along each axis, and H_{i,j,k} is the weight at the integration point (\xi_i, \eta_j, \zeta_k), which is written as

H_{i,j,k} = H_i \cdot H_j \cdot H_k \qquad (4.2.16)

If we take an eight-noded hexahedral linear isoparametric element with the nodal arrangement shown in Fig. 4.3, the stiffness matrix of the element has 24 rows and 24 columns, and the basis functions of the element are given as

Fig. 4.3 8-noded linear solid element

N_1(\xi, \eta, \zeta) = \frac{1}{8}(1-\xi)(1-\eta)(1-\zeta) \qquad (4.2.17)

N_2(\xi, \eta, \zeta) = \frac{1}{8}(1+\xi)(1-\eta)(1-\zeta) \qquad (4.2.18)

N_3(\xi, \eta, \zeta) = \frac{1}{8}(1+\xi)(1+\eta)(1-\zeta) \qquad (4.2.19)

N_4(\xi, \eta, \zeta) = \frac{1}{8}(1-\xi)(1+\eta)(1-\zeta) \qquad (4.2.20)

N_5(\xi, \eta, \zeta) = \frac{1}{8}(1-\xi)(1-\eta)(1+\zeta) \qquad (4.2.21)

N_6(\xi, \eta, \zeta) = \frac{1}{8}(1+\xi)(1-\eta)(1+\zeta) \qquad (4.2.22)

N_7(\xi, \eta, \zeta) = \frac{1}{8}(1+\xi)(1+\eta)(1+\zeta) \qquad (4.2.23)

N_8(\xi, \eta, \zeta) = \frac{1}{8}(1-\xi)(1+\eta)(1+\zeta) \qquad (4.2.24)
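The whole element-stiffness integration of Eqs. (4.2.12)–(4.2.16) with these basis functions can be condensed into a short routine. The sketch below is our own illustration (not the authors' code), assumes NumPy, and uses the node ordering of Fig. 4.3; an equivalent Lamé-constant form is used in place of the explicit matrix of Eq. (4.2.8).

import numpy as np

# parametric coordinates (signs) of the eight nodes, ordered as in Fig. 4.3
SIGNS = np.array([[-1, -1, -1], [ 1, -1, -1], [ 1,  1, -1], [-1,  1, -1],
                  [-1, -1,  1], [ 1, -1,  1], [ 1,  1,  1], [-1,  1,  1]], float)

def dshape(xi, eta, zeta):
    # derivatives dN_i/d(xi, eta, zeta) of Eqs. (4.2.17)-(4.2.24), shape (8, 3)
    d = np.empty((8, 3))
    for i, (a, b, c) in enumerate(SIGNS):
        d[i] = [a * (1 + b*eta) * (1 + c*zeta) / 8.0,
                b * (1 + a*xi) * (1 + c*zeta) / 8.0,
                c * (1 + a*xi) * (1 + b*eta) / 8.0]
    return d

def d_matrix(E, nu):
    # isotropic stress-strain matrix, equivalent to Eq. (4.2.8)
    lam = E * nu / ((1 + nu) * (1 - 2*nu))
    mu = E / (2 * (1 + nu))
    D = np.zeros((6, 6))
    D[:3, :3] = lam
    D[:3, :3] += 2 * mu * np.eye(3)
    D[3:, 3:] = mu * np.eye(3)
    return D

def element_stiffness(coords, E=1.0, nu=0.3, ngauss=2):
    # [k^e] by Gauss-Legendre quadrature, Eq. (4.2.15); coords is an (8, 3) array
    gp, gw = np.polynomial.legendre.leggauss(ngauss)
    ke = np.zeros((24, 24))
    for i, xi in enumerate(gp):
        for j, eta in enumerate(gp):
            for k, zeta in enumerate(gp):
                dN = dshape(xi, eta, zeta)
                J = dN.T @ coords                 # Jacobian, Eq. (4.2.13)
                dNx = dN @ np.linalg.inv(J).T     # dN_i/d(x, y, z), Eq. (4.2.14)
                B = np.zeros((6, 24))
                for a in range(8):
                    nx, ny, nz = dNx[a]
                    B[0, 3*a], B[1, 3*a+1], B[2, 3*a+2] = nx, ny, nz
                    B[3, 3*a], B[3, 3*a+1] = ny, nx
                    B[4, 3*a+1], B[4, 3*a+2] = nz, ny
                    B[5, 3*a], B[5, 3*a+2] = nz, nx
                w = gw[i] * gw[j] * gw[k]         # Eq. (4.2.16)
                ke += B.T @ d_matrix(E, nu) @ B * np.linalg.det(J) * w
    return ke

# the cubic element with the (0, 0, 0)-(1, 1, 1) diagonal used in Sect. 4.3
coords = (SIGNS + 1.0) / 2.0
print(element_stiffness(coords).shape)            # (24, 24)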

4.3 Accuracy Dependency of Stiffness Matrix on Numerical Quadrature

When an element stiffness matrix (Sect. 4.2) of the finite element method is calculated
using the numerical integration method (Sect. 4.1), its accuracy usually depends on
the shape of the element. In this section, we discuss how to quantitatively evaluate
the error and some clues to improve the accuracy.
Using the Gauss–Legendre quadrature, the element stiffness matrix [k] in the finite
element method is represented by the sum of the product of the value of integrand
function and the weight at each integration point as

[k] = \int_{v^e} [B]^T [D] [B]\,dv \approx \sum_{g=1}^{N_G} \left[ F\left(\xi_g, \eta_g, \zeta_g\right) \right] w_g \qquad (4.3.1)

where [B] is the strain–displacement matrix, [D] the stress–strain matrix, v^e the element domain, N_G the total number of integration points, (\xi_g, \eta_g, \zeta_g) the coordinates of the g-th integration point, [F(\xi_g, \eta_g, \zeta_g)] the element stiffness matrix calculated only with the contribution of the g-th integration point, and w_g the weight of the g-th integration point. Note that, in the standard Gauss–Legendre quadrature, the coordinates and the weights of the integration points are independent of the integrand: the standard values, which depend only on the number of integration points, are used in common for any integrand function.
Since the Gauss–Legendre quadrature uses a polynomial as the approximation of the integrand, it is inevitable that errors may occur in the integration. As is clear from Eq. (4.3.1), the computational complexity is proportional to the number of integration points, so a moderate number of points is used in practice. It is also known that the shape of the element has a great influence on the accuracy of the elemental integration; a square-shaped element in two-dimensional space and a cubic-shaped element in three-dimensional space can be integrated with very high accuracy even with a small number of integration points, whereas the accuracy decreases rapidly as the distortion of the element grows.
Consider an eight-noded hexahedral element of cubic shape with (0, 0, 0)-(1, 1, 1) as the diagonal, as shown in Fig. 4.4, with the basis functions given by Eqs. (4.2.17)–(4.2.24). Let us study the accuracy of the element integration when distortion is introduced into the shape of the element by changing the positions of nodes other than node P0 in the figure. In Fig. 4.5, element A has a cubic shape, element B some degree of distortion, and element C a larger degree of distortion. The coordinates of each nodal point of elements B and C are shown in Table 4.2. The results are shown in Fig. 4.6, where the horizontal axis is the number of integration points per axis and the vertical axis the error index (Error) of the numerical quadrature of the element stiffness matrix, which is defined by
\text{Error} = \frac{\sum_{ij} \left| \left[ k^g \right]_{ij} - \left[ k^{exact} \right]_{ij} \right|}{\max_{ij} \left| \left[ k^{exact} \right]_{ij} \right|} \qquad (4.3.2)

where [k^g] is the element stiffness matrix calculated with g integration points, [k^{exact}] is the exact element stiffness matrix, and [\,\cdot\,]_{ij} denotes the component located at the i-th row and the j-th column of the matrix. Since it is difficult to obtain the exact element stiffness matrix, that calculated with 30 integration points per axis, i.e., 27,000 integration points per element, is considered to be the exact one.
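The error index is straightforward to evaluate once the two matrices are available; the sketch below assumes NumPy and the element_stiffness routine sketched at the end of Sect. 4.2 (coords_B is a placeholder name for the nodal coordinates of element B in Table 4.2).

import numpy as np

def error_index(k_g, k_exact):
    # Eq. (4.3.2): summed absolute differences scaled by the largest exact component
    return np.sum(np.abs(k_g - k_exact)) / np.max(np.abs(k_exact))

# example usage (illustrative only):
# k_exact = element_stiffness(coords_B, ngauss=30)   # taken as the exact matrix
# for n in range(2, 11):
#     print(n, error_index(element_stiffness(coords_B, ngauss=n), k_exact))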

Fig. 4.4 Linear hexahedron element

Fig. 4.5 Elements tested for convergence of elemental integration

It can be seen from the figure that the almost converged element stiffness matrix of
the element A with perfect cubic shape is obtained with only two integration points
per axis, while the convergence speed slows down as the shape distortion grows, and
the number of integration points required to reach a prescribed accuracy increases.

Table 4.2 Tested elements


x y z
B P0 0.00000 0.00000 0.00000
P1 1.00453 −0.15174 −0.11027
P2 1.19228 0.83914 −0.03755
P3 0.14488 1.02768 0.18917
P4 −0.06256 0.18433 1.11555
P5 1.16262 −0.03545 1.10962
P6 0.94371 0.93326 0.95980
P7 −0.16756 0.92073 0.96290
C P0 0.00000 0.00000 0.00000
P1 1.03243 −0.04702 −0.10338
P2 0.83863 0.86761 0.19397
P3 0.14383 1.15863 −0.17659
P4 −0.09155 −0.01175 0.98928
P5 0.84633 −0.00810 0.98282
P6 1.00336 1.19198 1.18690
P7 −0.03052 0.83998 1.14285

Fig. 4.6 Convergence of elemental integration (error versus number of quadrature points per axis for elements A, B, and C)

The degradation of the accuracy of the numerical quadrature of an element stiff-


ness matrix due to the distorted shape of the element has been recognized for a long
time, and the accuracy improvement by symbolic manipulation has been studied
[3, 4].
The elemental integration is an important process not only in the finite element
method but also in such areas as isogeometric analysis [5–16].
Instead of the conventional method to improve the elemental integration, let
us study a method to improve the accuracy of the Gauss–Legendre quadrature by
selecting the quadrature parameters element by element.
The optimal quadrature parameters in the elemental integration can be obtained as the parameters that minimize L defined as follows:

L = \left\| \sum_{g=1}^{N_G} \left[ F\left(\xi_g, \eta_g, \zeta_g\right) \right] w_g - \left[ k^{exact} \right] \right\| \qquad (4.3.3)

where N_G is the total number of integration points per element, [k^{exact}] is the true element stiffness matrix, which is usually substituted by that calculated with a large number of integration points (e.g., 30 points per axis, or 27,000 points per element), and \|\cdot\| means taking the sum of the squares of each matrix component. Equation (4.3.2) is defined as the ratio of the difference to the maximum component of the matrix to make the index comparable between matrices of different elements with different shapes, while Eq. (4.3.3) is a simple norm intended for comparison between matrices of the same element with different quadrature parameters.
Let the standard values of the coordinates and the weight of the g-th integration point in the Gauss–Legendre quadrature be denoted, respectively, as (\xi_g^0, \eta_g^0, \zeta_g^0) and w_g^0; then (\xi_g, \eta_g, \zeta_g) and w_g can be expressed as

\xi_g = \xi_g^0 + \Delta\xi_g \quad (g = 1, \cdots, N_G) \qquad (4.3.4)

\eta_g = \eta_g^0 + \Delta\eta_g \quad (g = 1, \cdots, N_G) \qquad (4.3.5)

\zeta_g = \zeta_g^0 + \Delta\zeta_g \quad (g = 1, \cdots, N_G) \qquad (4.3.6)

w_g = w_g^0 \left(1 + \Delta w_g\right) \quad (g = 1, \cdots, N_G) \qquad (4.3.7)

Using the notations above, the sum of squared errors in Eq. (4.3.3) is given as

L = L(\Delta\xi, \Delta\eta, \Delta\zeta, \Delta w) \qquad (4.3.8)

where, for example, \Delta\xi = \left(\Delta\xi_1, \Delta\xi_2, \cdots, \Delta\xi_{N_G}\right).
112 4 Numerical Quadrature with Deep Learning

Defining the fitness to be the inverted ratio of the L value to that with the standard quadrature parameters (= L(0, 0, 0, 0)) as

\text{Fitness}(\Delta\xi, \Delta\eta, \Delta\zeta, \Delta w) = \frac{L(0, 0, 0, 0)}{L(\Delta\xi, \Delta\eta, \Delta\zeta, \Delta w)} \qquad (4.3.9)

we obtain the optimal quadrature parameters as the (\Delta\xi, \Delta\eta, \Delta\zeta, \Delta w) that result in the maximum fitness.
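In code, L and the fitness reduce to a few lines. The sketch below assumes NumPy and a routine (here called element_stiffness_modified, a name of our own) that evaluates Eq. (4.3.1) with the shifted coordinates and weights of Eqs. (4.3.4)–(4.3.7); that routine is not shown.

import numpy as np

def loss(k_modified, k_exact):
    # L of Eq. (4.3.3): sum of squared component-wise differences
    return np.sum((k_modified - k_exact) ** 2)

def fitness(k_modified, k_standard, k_exact):
    # Eq. (4.3.9): L with the standard parameters divided by L with the modified ones
    return loss(k_standard, k_exact) / loss(k_modified, k_exact)

# example usage (illustrative only):
# k_std = element_stiffness_modified(coords, d_xi=0, d_eta=0, d_zeta=0, d_w=0)
# k_mod = element_stiffness_modified(coords, d_xi, d_eta, d_zeta, d_w)
# print(fitness(k_mod, k_std, k_exact))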
Here, the effect of (Δξ , Δη, Δζ , Δw) on the fitness is studied for the case where
the element stiffness matrix of an eight-noded hexahedral element is calculated using
the numerical quadrature with eight integration points (two points in each axis).
Using the element of the cubic shape with the (0, 0, 0)-(1, 1, 1) being the diagonal
as a reference (See Fig. 4.4), we generate distorted elements by changing the position
of the nodal point P6 .
Figure 4.7 shows the change in the fitness when the coordinate ξ of each integration point is shifted for the element with the coordinates of P6 being (1, 1, 2). Here, the vertical axis is the fitness, and when it exceeds 1.0, it means that the element stiffness matrix calculated with the modified quadrature parameters is closer to the true matrix than that calculated with the standard quadrature parameters. The horizontal axis is the amount of change in the coordinate value of the integration point. For each of the eight integration points, the change in fitness when the coordinate ξ of the integration point is moved is shown. For example, 001 in the figure means the integration point whose standard coordinates are (−1/√3, −1/√3, 1/√3), and 010 that whose standard coordinates are (−1/√3, 1/√3, −1/√3). Results for integration points whose standard ζ-coordinate is 1/√3 are shown with markers. Note that some lines overlap due to symmetry. Figures 4.8 and 4.9 show the change in the fitness when the η and ζ coordinates of each integration point are moved, respectively. The latter figure shows that the fitness is improved by moving the integration points with ζ-coordinate 1/√3 in the plus (+) direction and the integration points with ζ-coordinate −1/√3 in the minus (−) direction.
It is also seen that the degree of improvement in the fitness is different for each
integration point. Figure 4.10 shows the change in the fitness when the weights of
each integration point are changed for the same element, depicting that the fitness is
improved by increasing the weight at any integration point.
Figure 4.11 shows the change in the fitness when the coordinate ξ of each inte-
gration point is shifted with the position of P6 being (2,1,1), while Figs. 4.12 and
4.13 show those when the coordinates η and ζ of each integration point are shifted,
respectively. The change in the fitness when the weight of an integration point is
changed is also shown in Fig. 4.14.
Fig. 4.7 Fitness versus Δξ in the element with P6 (1, 1, 2)

Fig. 4.8 Fitness versus Δη in the element with P6 (1, 1, 2)

These results show that a fitness greater than 1.0 can be obtained by changing any of the quadrature parameters, that a more accurate numerical quadrature of an element stiffness matrix can be achieved by optimizing the quadrature parameters for each element, that the change in the fitness values near the optimum is gradual, and that changing the coordinates of the integration points results in better fitness than changing their weights.

Fig. 4.9 Fitness versus Δζ in the element with P6 (1, 1, 2)

Fig. 4.10 Fitness versus Δw in the element with P6 (1, 1, 2)

4.4 Search for Optimal Quadrature Parameters

In the previous section, it is shown that the accuracy of the numerical quadrature
can be improved by changing quadrature parameters and that the degree of improve-
ment is quantitatively evaluated by the fitness defined in Eq. (4.3.9). In this section,
defining the optimal quadrature parameters as (Δξ, Δη, Δζ, Δw) that maximize the fitness for a given number of integration points, a method for obtaining the optimal quadrature parameters for each element is discussed.

Fig. 4.11 Fitness versus Δξ in the element with P6 (2, 1, 1)

Fig. 4.12 Fitness versus Δη in the element with P6 (2, 1, 1)
The quadrature parameters are classified into two categories: the coordinates of
the integration points Δξ , Δη, and Δζ , and the weights of the integration points Δw.
Fig. 4.13 Fitness versus Δζ in the element with P6 (2, 1, 1)

Fig. 4.14 Fitness versus Δw in the element with P6 (2, 1, 1)

With regard to the increase in the computational load when performing the numerical
quadrature for an element stiffness matrix using optimal quadrature parameters, the
change in the weights of integration points Δw requires only a few modifications in
the program for the element stiffness matrix causing little increase in the computa-
tional load, whereas the change in the coordinates of integration points Δξ , Δη, and
Δζ requires additional changes in the program related to basis functions, which often
increases the computational load significantly. In addition, the number of coordinate
values Δξ , Δη, and Δζ to be tuned is three times as large as that of weights Δw.
On the other hand, as we have seen in Sect. 4.3, optimization of the coordinates
of integration points may provide a higher degree of improvement in the accuracy
of the elemental integration than that of the weights alone.
In any case, an efficient method for searching the optimal parameters is required.
In [17], the weights Δw are optimized employing a random search to find the optimal
parameters. On the other hand, the set of optimal parameters (Δξ, Δη, Δζ, Δw), maximizing the fitness defined in Eq. (4.3.9) or, equivalently, minimizing the error defined in Eq. (4.3.8), is efficiently obtained by using various evolutionary computation algorithms, which have the additional advantage that it is easy to impose various constraints on the individual parameters to be tuned while using Eq. (4.3.9) as the objective function to be maximized.
Here, we study an efficient search method for the optimal quadrature parameters
using the evolutionary algorithms, which are optimization algorithms inspired by
the evolution and behavior of living things, such as genetic algorithm (GA) [18, 19],
which imitates the evolution of lives, artificial bee colony algorithm (ABC) [20],
which imitates the foraging behavior of honeybees, particle swarm optimization
(PSO) [21], which mimics the swarming behavior of birds and fish, firefly algorithm
(FA) [22], which mimics the courtship behavior of fireflies, and bat algorithm (BA)
[23], which mimics the echolocation behavior of bats. They are often
called the swarm intelligence [24]. Evolutionary computation algorithms have been
applied to a variety of engineering problems [25–27].
In this section, PSO is employed among others to search for the optimal quadrature
parameters.
When the number of parameters to be optimized is $N_p$, the i-th individual $\boldsymbol{x}_i^n$ (or its equivalent) and its speed $\boldsymbol{v}_i^n$ at the n-th generation (or n-th iteration) in PSO are represented by one-dimensional arrays, respectively, as follows:

$$\boldsymbol{x}_i^n = \left(x_{i,1}^n, x_{i,2}^n, x_{i,3}^n, \cdots, x_{i,N_p-2}^n, x_{i,N_p-1}^n, x_{i,N_p}^n\right) \tag{4.4.1}$$

$$\boldsymbol{v}_i^n = \left(v_{i,1}^n, v_{i,2}^n, v_{i,3}^n, \cdots, v_{i,N_p-2}^n, v_{i,N_p-1}^n, v_{i,N_p}^n\right) \tag{4.4.2}$$

where $x_{i,j}^n$ is the j-th component of the coordinates indicating the position in the $N_p$-dimensional space of the i-th individual of the n-th generation, and $v_{i,j}^n$ is its velocity. The coordinates of each individual correspond to the set of quadrature parameters (Δξ, Δη, Δζ, Δw), which allows the fitness value of each individual to be calculated from Eq. (4.3.9).
The update equations for $\boldsymbol{v}_i^n$ and for $\boldsymbol{x}_i^n$ are, respectively, written as follows:

$$\boldsymbol{v}_i^n = \alpha \boldsymbol{v}_i^{n-1} + \beta \left(\boldsymbol{g}^n - \boldsymbol{x}_i^n\right) \times rnd + \gamma \left(\boldsymbol{p}_i^n - \boldsymbol{x}_i^n\right) \times rnd \tag{4.4.3}$$

$$\boldsymbol{x}_i^{n+1} = \boldsymbol{x}_i^n + \boldsymbol{v}_i^n \tag{4.4.4}$$

where $\boldsymbol{g}^n$ is the best individual in the population at the n-th generation; $\boldsymbol{p}_i^n$ is the best of the i-th individual up to the n-th generation; α, β, and γ are constants; and rnd is a random number in the range [0.0, 1.0].
The flowchart of PSO is shown in Fig. 4.15. After the initial population is gener-
ated, the best individual of each generation and the best of each individual up to the
current generation are determined, and each individual is repeatedly updated along
the directions to these best individuals. PSO is an algorithm that exploits the direction toward the best individuals found so far, and once a good individual is found, it is expected to show good convergence in real-valued searches such as the present case.
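A minimal Python sketch of the PSO iteration of Eqs. (4.4.3) and (4.4.4) is given below. The fitness function is passed in as a callable (e.g., the fitness of Eq. (4.3.9) evaluated through an element integration routine), a single bound is applied to all parameters for simplicity, and the coefficient values, population size, and number of steps are placeholders rather than the settings actually used in this chapter.

    import numpy as np

    def pso(fitness, n_params, n_particles=100, n_steps=1000,
            alpha=0.9, beta=0.5, gamma=0.5, bound=0.1, seed=None):
        rng = np.random.default_rng(seed)
        x = rng.uniform(-bound, bound, (n_particles, n_params))  # positions x_i^n
        v = np.zeros_like(x)                                     # velocities v_i^n
        f = np.array([fitness(xi) for xi in x])
        p_best, f_p = x.copy(), f.copy()                         # personal bests p_i^n
        g_best = x[np.argmax(f)].copy()                          # global best g^n
        for _ in range(n_steps):
            r1 = rng.random((n_particles, 1))
            r2 = rng.random((n_particles, 1))
            v = alpha * v + beta * (g_best - x) * r1 + gamma * (p_best - x) * r2  # Eq. (4.4.3)
            x = np.clip(x + v, -bound, bound)                                     # Eq. (4.4.4) with bounds
            f = np.array([fitness(xi) for xi in x])
            better = f > f_p
            p_best[better], f_p[better] = x[better], f[better]
            g_best = p_best[np.argmax(f_p)].copy()
        return g_best, float(f_p.max())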
Here, PSO is used to search for the optimal quadrature parameters for an eight-
noded hexahedral element. The number of integration points is set to eight (two
for each axis). Using the element of cubic shape shown in Fig. 4.4, 100 hexahe-
dral elements are generated by randomly shifting all nodal coordinates in the range
of [−0.2, 0.2] from the reference position. Then, for these generated elements, the optimal quadrature parameters, which maximize Eq. (4.3.9), are searched for by PSO, where the number of individuals is set to 1,000, that of generations (steps) to 10,000, the maximum change in the coordinates of integration points is limited to within ±0.1, and the change in the weights at integration points to within ±10% of the standard weights.
The quadrature parameters for the search are 24 coordinate values and 8 weights of eight integration points; then the length of the array representing an individual is set to 32.

Fig. 4.15 Flowchart of PSO
The structure of the array of the quadrature parameters for each individual is shown as

$$\boldsymbol{x}_i^n = \left((\Delta\xi_1)_i^n, (\Delta\eta_1)_i^n, (\Delta\zeta_1)_i^n, \cdots, (\Delta\xi_8)_i^n, (\Delta\eta_8)_i^n, (\Delta\zeta_8)_i^n, (\Delta w_1)_i^n, \cdots, (\Delta w_8)_i^n\right) \tag{4.4.5}$$
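For clarity, the small helper below shows how an individual laid out as in Eq. (4.4.5) can be split back into the four groups of quadrature parameters; the function name is illustrative only, and the 8 × 3 + 8 = 32 layout follows the equation.

    import numpy as np

    def decode_individual(x):
        # x: array of length 32 ordered as in Eq. (4.4.5)
        coords = np.asarray(x[:24]).reshape(8, 3)   # per point: (Δξ_g, Δη_g, Δζ_g)
        dxi, deta, dzeta = coords[:, 0], coords[:, 1], coords[:, 2]
        dw = np.asarray(x[24:32])                    # Δw_g per integration point
        return dxi, deta, dzeta, dw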

Changing the random numbers, 10 trials of the optimal quadrature parameter search are performed by PSO for each of the elements with various shapes, where a uniform random number is used, a Ryzen 7 3700X CPU is used for all the tests, and the time required for 10,000 generations of evolution in PSO is about 74 s.
Calculation of the fitness of each individual is computationally demanding
because it is calculated using Eqs. (4.3.3) and (4.3.9) on the element stiffness matrix
obtained by the element integration using the quadrature parameters represented by
the individual.
Out of 1000 trials (100 elements, 10 trials for each element), 26 trials have failed to
find better quadrature parameters than the standard ones. The maximum, average, and
minimum values of the fitness obtained in 10 trials with different random numbers
are taken to be the result of optimal quadrature parameter search by PSO for each
element; then the average value of the fitness over 100 elements is determined.
The quadrature parameters subject to the optimal value search by PSO are clas-
sified into the coordinates of the integration points and their weights. We have
conducted experiments for three cases: when all the parameters are subject to opti-
mization (denoted as C & W), when only the coordinates of the integration points
are subject to optimization with the weights fixed to the standard values (denoted
as C), and when only the weights of the integration points are optimized with the
coordinates of the integration points fixed to the standard values (denoted as W).
Figure 4.16 shows the result that the improvement in fitness by optimizing the
coordinates of the integration points is much larger than that by optimizing the
weights.
On the other hand, Fig. 4.17 shows the correlation between the fitness obtained
by optimizing all parameters (C & W) and that by optimizing only the coordinates
of integration points (C), where each point corresponds to an element. It can be seen from the figure that almost all the points are below the diagonal, indicating that the fitness is improved by optimizing the coordinates and weights simultaneously and that there is a clear correlation between the two results (C & W and C); in other words, elements with high fitness values in C & W show the same tendency in C.
Figure 4.18 shows the correlation between the fitness obtained by optimizing only
the coordinates (C) and that obtained by optimizing only the weights (W). It can be seen from the figure that all the points lie below the diagonal line, which would indicate equal fitness between the two methods, meaning that optimization of the coordinates (C) always results in a greater fitness than optimization of the weights only (W).
Fig. 4.16 Fitnesses obtained by optimizing parameters (C & W, C, and W; best, average, and worst cases)

Fig. 4.17 Effect of parameters to be optimized: fitness obtained by optimizing only the coordinates (C) versus that obtained by optimizing all parameters (C & W)

From both figures above, it can be inferred that an element for which a high fitness value is obtained by one of the three optimization methods (i.e., an element for which the fitness value can be greatly improved) will have a high fitness value with any of the other methods, indicating a strong dependency on the element shape.
The distribution of the best fitness obtained when all parameters are optimized, i.e., the best in Fig. 4.16, is shown in Fig. 4.19, indicating that the fitness varies element by element significantly. The shape of the element with the largest fitness is shown in Fig. 4.20a and that with the smallest fitness in Fig. 4.20b.

Fig. 4.18 Effect of parameters to be optimized: fitness obtained by optimizing only the weights (W) versus that obtained by optimizing only the coordinates (C)
It is concluded that PSO is effective in finding the optimal quadrature parameters
for each element, with which a more accurate element stiffness matrix is obtained
than with the standard Gauss–Legendre quadrature parameters.

Fig. 4.19 Distribution of fitnesses (number of elements versus fitness)
Fig. 4.20 Elements tested: (a) highest fitness, (b) lowest fitness

4.5 Search for Optimal Number of Quadrature Points

In Sect. 4.4, we have discussed how to improve the accuracy of the elemental integra-
tion by optimizing the quadrature parameters with the number of integration points
fixed. On the other hand, the accuracy improvement of the elemental integration
can also be achieved by increasing the number of integration points. In the present
section, we study the optimal number of integration points to achieve a predetermined
accuracy and the effect of elemental shape on the optimal number.
As shown in Fig. 4.6, the convergence rate of the element integration depends on
the shape of the element. In the case of an eight-noded hexahedral element of cubic
shape, an accurate element stiffness matrix is obtained even with two integration points per axis (or eight integration points in total), whereas for an element of irregular shape, a large number of integration points are required to obtain an accurate element stiffness matrix. Note that good accuracy here means that the difference between the element stiffness matrix under consideration and that obtained with a very large number of integration points, i.e., the Error in Eq. (4.3.2), is small.
Here, the optimal number of integration points for an element is defined as
the minimum number of integration points per axis for which the error defined in
Eq. (4.3.2) is less than a predefined value (threshold). In the case of Fig. 4.6 with the threshold set to 10^−7, the optimal number of integration points for element A is 2, that for element B is 5, and that for element C is 8, where the same number of integration points per axis is assumed for all the axes.
In large-scale finite element analysis, the number of elements employed is huge and their shapes are various. If the same number of integration points is assumed for every element in the domain, the accuracy of the calculated element stiffness matrix may vary element by element as discussed above. Therefore, it seems reasonable to perform the numerical quadrature for the element stiffness matrix using the optimal number of integration points for each element.
Let the eight-noded hexahedral element of a cubic shape (Fig. 4.4) be a reference.
We generate a number of elements of various shapes from the above element by
translating all the nodes within a range of ± 0.2 along each axis, but the nodes P0
and P1 are fixed at (0,0,0) and (1,0,0), respectively, and the z-coordinate of node
P3 is fixed to 0 [17]. Using this method, a total of 100,000 elements are generated
and the optimal numbers of integration points for each element for three different
threshold values are calculated, and then the numbers of elements classified by the
optimal number of integration points are tabulated (Table 4.3). For example, when
the threshold is set to 10^−7, 58,467 out of 100,000 elements are found to have an optimal number of integration points of 5 per axis.

Table 4.3 Optimal number of quadrature points for 100,000 elements

Optimal number of quadrature points | Threshold = 10^−6 | Threshold = 10^−7 | Threshold = 10^−8
 4                                  |            11,464 |               542 |                11
 5                                  |            78,855 |            58,467 |            20,496
 6                                  |              9288 |            37,852 |            63,982
 7                                  |               367 |              2917 |            14,120
 8                                  |                22 |               196 |              1236
 9                                  |                 4 |                22 |               130
10                                  |                 0 |                 2 |                20
11                                  |                 0 |                 2 |                 2
12                                  |                 0 |                 0 |                 3

4.6 Deep Learning for Optimal Quadrature of Element Stiffness Matrix

Based on the discussions above, two methods for optimizing the elemental integration
using deep learning are discussed here: Sect. 4.6.1 describes the estimation of optimal
quadrature parameters and Sect. 4.6.2 that of the optimal number of integration points.

4.6.1 Estimation of Optimal Quadrature Parameters by Deep Learning

As we have seen in Sect. 4.3, the optimization of the Gauss–Legendre quadrature, or the improvement of the accuracy of the integral value, is achieved by integrating each integrand using its own optimal quadrature parameters, i.e., coordinates and weights of integration points. However, it is not practical to prepare optimal quadrature parameters for a large number and variety of integrands. Therefore, we consider here targeting only a specific group of integrands for the optimization.
Since the integrand of an element integral in the finite element method usually
consists of the derivative of basis functions, it is identified by a finite number of
parameters, e.g., the coordinate values of the nodes in the element.
However, it is difficult to prepare in advance the optimal quadrature parameters
for each integrand in the form of a table.
Therefore, instead of preparing optimal quadrature parameters for each element
in advance, we regard the correspondence from elemental parameters to optimal
quadrature parameters as a mapping. In other words, we need only to prepare the
rules for deriving the optimal quadrature parameters from the elemental parameters in
order to realize the optimized numerical quadrature. Here, we employ deep learning
to construct this mapping and extract the rules.
Denoting the parameters describing the element shape (nodal coordinates, etc.)
and the number of integration points used in the elemental integration, respectively,
as {e − parameters} and n, m, and l in each axis, the method for obtaining the optimal
quadrature parameters for the element integration by deep learning is summarized
as
(1) Data Preparation Phase: Calculate the optimal quadrature parameters {Δξ^opt, Δη^opt, Δζ^opt, Δw^opt} for a large number of elements with various shapes (see Sect. 4.4). This results in a large number of data pairs ({e − parameters}, {Δξ^opt, Δη^opt, Δζ^opt, Δw^opt}).
(2) Training Phase: Deep learning is performed using the data pairs above, setting the input data and teacher data, respectively, as follows:
Input data: {e − parameters}
Teacher data: {Δξ^opt, Δη^opt, Δζ^opt, Δw^opt}
The trained neural network above works as
Input data: {e − parameters}
Output data: {Δξ^DL, Δη^DL, Δζ^DL, Δw^DL}
(3) Application Phase: The neural network trained above is incorporated in the elemental integration process of the analysis code. Specifically, when {e − parameters} of a new element are given to the trained neural network as input, the estimated optimal quadrature parameters {Δξ^DL, Δη^DL, Δζ^DL, Δw^DL} are promptly output, which are used for the numerical quadrature to calculate the element stiffness matrix of the element.
The calculation of optimal quadrature parameters in the Data Preparation Phase is
a computationally demanding process, because we need to obtain a highly accurate
element stiffness matrix with a large number of integration points for each of a number
of elements of different shape and to search for the optimal quadrature parameters
using an evolutionary computation algorithm with this matrix as a reference.
In addition, the Training Phase is also a computationally demanding process.
However, these two phases can be performed independently of the Application Phase
by using dedicated high-speed computers.
On the other hand, the inference in the Application Phase is usually expected to
be performed in a short time due to its small computational load.
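As a rough illustration of the Training Phase just described, the following PyTorch sketch fits a feedforward network that maps {e − parameters} to the 32 optimal quadrature parameters. The layer sizes, activation functions, optimizer, and epoch count are assumptions made for illustration and do not reproduce the settings used in the book.

    import torch
    from torch import nn

    def train_regressor(e_params, q_opt, n_epochs=1000, lr=1e-3):
        X = torch.as_tensor(e_params, dtype=torch.float32)  # element shape parameters
        Y = torch.as_tensor(q_opt, dtype=torch.float32)     # optimal (Δξ, Δη, Δζ, Δw), 32 values
        model = nn.Sequential(
            nn.Linear(X.shape[1], 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, Y.shape[1]),
        )
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(n_epochs):
            optimizer.zero_grad()
            loss = loss_fn(model(X), Y)
            loss.backward()
            optimizer.step()
        return model  # Application Phase: q_dl = model(new_e_params) for a new element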

4.6.2 Estimation of Optimal Number of Quadrature Points by Deep Learning

In this section, we study how to develop the rules that derive the optimal number of integration points from the element shape parameters, where deep learning is employed to construct the rules behind the correspondence.
The method for obtaining the optimal number of integration points in the elemental integration by deep learning is summarized in the following three phases, where the parameters describing the element shape (nodal coordinates, etc.) are denoted as {e − parameters}, and the numbers of integration points used in the element integration in each axis by n, m, and l, respectively. Here, it is assumed that n = m = l.
(1) Data Preparation Phase: Setting a threshold, calculate the optimal number of integration points n^opt for a large number of elements with various shapes (see Sect. 4.5). This yields a large number of data pairs ({e − parameters}, n^opt).
(2) Training Phase: Deep learning is performed on the data pairs obtained above, where input and teacher data are, respectively, set as follows:
Input data: {e − parameters}
Teacher data: n^opt
Once the training is done, the input and output of the trained neural network are, respectively, given as follows:
Input data: {e − parameters}
Output data: n^DL
(3) Application Phase: The neural network trained in the Training Phase above is
incorporated in the elemental integration process of the analysis code. Specifi-
cally, when {e − parameters} of a new element are given to the trained neural
network as input, the estimated values of the optimal number of integration
points n^DL are promptly output and then used for the numerical quadrature for
the element stiffness matrix of the element.
The determination of the optimal number of integration points in the Data Prepara-
tion Phase is a computationally demanding process, since we need to obtain a highly
accurate element stiffness matrix using a large number of integration points and then
repeatedly perform the element integration sequentially increasing the number of
integration points starting from a small number until the Error in Eq. (4.3.2) falls
below the preset threshold. This process is repeatedly performed for a lot of elements
of different shapes. In addition, the Training Phase, where data pairs collected in the
Data Preparation Phase are used as training patterns, is also a computationally heavy
process.
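A minimal sketch of this search loop is given below; element_stiffness(nodes, n), which returns the matrix integrated with n points per axis, and error(K, K_ref), which implements the Error of Eq. (4.3.2), are assumed to be provided by the analysis code, and the reference matrix uses an arbitrarily chosen large number of points.

    def optimal_num_points(nodes, element_stiffness, error, threshold=1e-7,
                           n_start=2, n_max=12, n_ref=30):
        K_ref = element_stiffness(nodes, n_ref)   # highly accurate reference matrix
        for n in range(n_start, n_max + 1):
            K = element_stiffness(nodes, n)       # n integration points per axis
            if error(K, K_ref) < threshold:       # Error of Eq. (4.3.2) below the threshold
                return n
        return None                               # not reached within n_max points per axis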
However, these two phases can be performed independently of the Application
Phase, say, by using some dedicated high-speed computers.
On the other hand, the calculation of the optimal number of integration points in
the Application Phase is performed by users, computational load of which is expected
to be not so heavy.

4.7 Numerical Example A

In this section, we discuss an example of the application of deep learning to the estimation of the optimal number of integration points in the element integration presented in Sect. 4.6.2. Specifically, taking the elemental integration of an 8-node hexahedral element as the target, the coordinates of the nodes constituting the element are used as the shape parameters of the element, and a feedforward neural network is constructed that outputs the optimal number of integration points n^opt of the element when the coordinates of the nodes are input.

4.7.1 Data Preparation Phase

First, training patterns are generated by random sampling from the data created in
Sect. 4.5. Among the eight nodes in an element, the node P0 is fixed to (0,0,0), the
node P1 to (1,0,0), and the z-coordinate of the node P3 to 0. Thus, 17 coordinate
values of the remaining nodes are used as {e − parameters} to define the element
shape. The threshold value of Error [Eq. (4.3.2)] in the element integration is set to
10^−7. As can be seen from Table 4.3, the optimal number of integration points is distributed from 4 to 11 for the 100,000 elements. It is noted that the number of elements with an optimal number of integration points greater than 8 is less than 50, which seems very small.
Here, we define the problem to classify elements into the following five categories.
Category 1: n^opt = 4 (542 elements belong to this category.)
Category 2: n^opt = 5 (58,467 elements belong to this category.)
Category 3: n^opt = 6 (37,852 elements belong to this category.)
Category 4: n^opt = 7 (2,917 elements belong to this category.)
Category 5: n^opt ≥ 8 (222 elements belong to this category.)
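A small sketch of the one-hot encoding of these five categories, used later as teacher data, might look as follows; the helper names are illustrative only, and values of n^opt below 4 do not occur in this data set.

    import numpy as np

    def category_index(n_opt):
        # Categories 1-5 above mapped to indices 0-4 (n_opt >= 8 falls into category 5)
        return min(n_opt, 8) - 4

    def one_hot(n_opt, n_categories=5):
        t = np.zeros(n_categories)
        t[category_index(n_opt)] = 1.0   # only the unit of the correct category is 1
        return t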

4.7.2 Training Phase

In this section, we construct a feedforward neural network that estimates the optimal number of integration points n^opt from the element shape. The input and output data (teacher data) of the neural network are set as follows:
Input data: 17 nodal coordinates of the 8 nodes in an element
Teacher data: n^opt, the optimal number of integration points for the element.
Regarding the structure of the feedforward neural network used here, the numbers
of units in the input and output layers are automatically determined by the number
of input data and that of teacher data, respectively. In the present case, the number
of units in the input layer is 17. That in the output layer is set to 5 as the one-hot
encoding is used as the teacher data. In other words, we have the same number of
outputs as the number of categories (5 in this case), and only the unit corresponding
to the correct category outputs 1, while the other units output 0.
On the other hand, the number of hidden layers and that of units in each hidden layer are often determined from various combinations by trial and error. Here, we have decided to choose the structure of the feedforward neural network from the following candidate combinations:
The numbers of hidden layers: 1, 2, 3, 4, 5, and 6
The numbers of units per hidden layer: 20, 40, 60, and 80
Note that, in the case of two or more hidden layers, the numbers of units in each hidden layer are assumed to be the same.
50,000 training patterns are randomly selected from the 100,000 patterns collected
in the Data Preparation Phase; then, out of them, five different numbers of training
patterns (5000, 10,000, 20,000, 30,000, and 50,000) are employed to check the effect
of the number of training patterns.
In addition, 10,000 patterns are randomly selected from the remaining 50,000
patterns to be used as patterns for verifying the generalization ability of the neural
network after training.
128 4 Numerical Quadrature with Deep Learning

As described above, we choose the best training condition from 120 conditions,
including 6 different numbers of hidden layers, 4 of units per hidden layer, and 5 of
training patterns. All other conditions such as the learning coefficients are common,
and the number of training epochs is set to 10,000.
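The following PyTorch sketch outlines how one such feedforward classifier (17 inputs, several hidden layers, 5 one-hot outputs) could be built and trained against the one-hot teacher data; the activation functions, optimizer, and learning rate are assumptions for illustration and are not taken from the book.

    import torch
    from torch import nn

    def build_classifier(n_in=17, n_hidden_layers=5, n_units=40, n_out=5):
        layers, d = [], n_in
        for _ in range(n_hidden_layers):
            layers += [nn.Linear(d, n_units), nn.ReLU()]
            d = n_units
        layers += [nn.Linear(d, n_out), nn.Sigmoid()]  # outputs compared with one-hot teacher data
        return nn.Sequential(*layers)

    def train_classifier(model, X, T, n_epochs=10000, lr=1e-3):
        X = torch.as_tensor(X, dtype=torch.float32)
        T = torch.as_tensor(T, dtype=torch.float32)    # one-hot teacher data
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(n_epochs):
            optimizer.zero_grad()
            loss = loss_fn(model(X), T)
            loss.backward()
            optimizer.step()
        return model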
For comparison of the neural networks trained with various conditions, the
average estimation error of the patterns for verification of the generalization ability
is employed, which is defined as

$$\text{Error}_{NN} = \frac{1}{N_V} \sum_{p=1}^{N_V} \sum_{j=1}^{5} \left| {}^{p}O_j - {}^{p}T_j \right| \tag{4.7.1}$$

where ${}^{p}O_j$ is the output of the j-th unit of the output layer for the p-th input pattern, ${}^{p}T_j$ the corresponding teacher data, and $N_V$ the number of patterns for verification of the generalization capability (10,000 in this case).
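Equation (4.7.1) can be evaluated directly from the network outputs and the one-hot teacher data, for example as in the short sketch below, where the two arrays are assumed to have shape (N_V, 5).

    import numpy as np

    def error_nn(outputs, teachers):
        # Eq. (4.7.1): mean over the verification patterns of the summed
        # absolute differences between network outputs and one-hot teacher data
        O = np.asarray(outputs)
        T = np.asarray(teachers)
        return float(np.mean(np.sum(np.abs(O - T), axis=1)))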
It is known in the training of neural networks that the results trained are affected
by the initial values of the connection weights. Therefore, in order to reduce the
influence of the initial value of the connection weight, five training sessions with
different initial values are performed under the same training condition, and the best
result among them is considered to represent the training condition.
Figures 4.21, 4.22, 4.23 and 4.24 show the training results, where the horizontal
axis is the number of hidden layers, and the vertical axis is the average estimation
error Error N N defined in Eq. (4.7.1). Figure 4.21 shows the results with 20 units per
hidden layer, where label TP05U20 depicts the results with 5000 training patterns
and TP50U20 those with 50,000 training patterns. Similarly, Figs. 4.22, 4.23, and
4.24 show the results when the number of units per hidden layer is 40, 60, and 80,
respectively.
All the results show that the accuracy improves as the number of training data
is increased. In most cases, in addition, the accuracy is improved by increasing the
number of hidden layers and units per hidden layer.

4.7.3 Application Phase

Based on the results of Sect. 4.7.2, we consider here estimating the optimal number of integration points using a neural network trained with 50,000 training patterns; among these, the neural network with 5 hidden layers and 40 units per hidden layer is employed. Some neural networks, including that with 6 hidden layers and 80 units per hidden layer, show smaller average estimation errors than the selected one, but the difference is small, so the selection has been made in consideration of reducing the amount of computation during estimation.
Fig. 4.21 Error versus number of hidden layers (20 units per layer)

Fig. 4.22 Error versus number of hidden layers (40 units per layer)

To study the generalization capability of the neural network employed, Table 4.4 shows the results of the estimation of the optimal number of integration points for the verification patterns (10,000 patterns). Since one-hot encoding is used as the output method, the category corresponding to the output unit that outputs the maximum value for each input pattern is judged as the estimated category (the optimal number of integration points). The table shows that the percentage of correct classification is more than 91% and that most of the misclassifications are into adjacent categories. Thus, the optimal number of integration points, shown to be estimated from the element geometry, can be used to reduce the computational load of the element integration process.

Fig. 4.23 Error versus number of hidden layers (60 units per layer)

Fig. 4.24 Error versus number of hidden layers (80 units per layer)
Table 4.4 Optimal number of quadrature points estimated by deep learning

            n^DL = 4 | n^DL = 5 | n^DL = 6 | n^DL = 7 | n^DL = 8 | Total
n^opt = 4          9 |       38 |        0 |        0 |        0 |    47
n^opt = 5          3 |     5478 |      355 |        0 |        0 |  5836
n^opt = 6          0 |      332 |     3414 |       49 |        0 |  3795
n^opt = 7          0 |        0 |       97 |      199 |        2 |   298
n^opt = 8          0 |        0 |        0 |       10 |       14 |    24

4.8 Numerical Example B

In this section, we show another example of application of deep learning to the estimation of the optimal number of integration points in the elemental integration of an 8-node hexahedral element. Here, a feedforward neural network is constructed that estimates the optimal number of integration points n^opt of the element using some features of the element shape.

4.8.1 Data Preparation Phase

The same element data and the teacher data as in Sect. 4.7 are employed
in the Data Preparation Phase. As the input data for the neural network,
seven shape features: AlgebraicShapeMetric, MaxEdgeLength, MinEdgeLength,
MaxEdgeAngle, MinEdgeAngle, MaxFaceAngle, and MinFaceAngle are employed
here. Computer codes for calculating these features are described in Sect. 9.1.2, and
each of them is detailed as follows:
AlgebraicShapeMetric: A quantity defined based on the condition number of a
hexahedral element [28], which takes values in the range [0.0,1.0] and is 1.0 for a
perfect cubic shape.
MaxEdgeLength and MinEdgeLength: The maximum and minimum edge lengths
of a hexahedral element, respectively. Note that, obviously, MaxEdgeLength ≥ 1.0
and MinEdgeLength ≤ 1.0 hold for the elements to be considered here.
MaxEdgeAngle and MinEdgeAngle: The maximum and minimum angles
between the three edges starting from each vertex of a hexahedral element,
respectively.
MaxFaceAngle and MinFaceAngle: The maximum and minimum angles between
the two faces that share an edge of a hexahedral element, respectively.
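As an illustration of such shape features, the sketch below computes MaxEdgeLength and MinEdgeLength from the nodal coordinates. The local edge numbering used here is an assumption for illustration only; the book's actual routines, including the one for AlgebraicShapeMetric, are given in Sect. 9.1.2.

    import numpy as np

    # Edge connectivity of an 8-noded hexahedron (assumed local node numbering:
    # nodes 0-3 on the bottom face, nodes 4-7 on the top face)
    HEX_EDGES = [(0, 1), (1, 2), (2, 3), (3, 0),
                 (4, 5), (5, 6), (6, 7), (7, 4),
                 (0, 4), (1, 5), (2, 6), (3, 7)]

    def edge_length_features(nodes):
        # nodes: (8, 3) array of nodal coordinates
        nodes = np.asarray(nodes)
        lengths = [np.linalg.norm(nodes[j] - nodes[i]) for i, j in HEX_EDGES]
        return max(lengths), min(lengths)   # MaxEdgeLength, MinEdgeLength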
The optimal number of integration points n^opt is defined as the minimum number of integration points per axis with which the Error in Eq. (4.3.2) is less than the threshold; i.e., n^opt is defined as the number of integration points satisfying the following equation:

$$\text{Error}\left(n^{opt} - 1\right) > \text{threshold} \geq \text{Error}\left(n^{opt}\right) \tag{4.8.1}$$

where Error(k) is the error when integrating an element with k integration points per
axis.
The difference between the error and the threshold when integrating with n^opt integration points depends on the elements. In order to evaluate the relationship between the shape features and the convergence of the elemental integration, we introduce $n_r^{opt}$ as a more detailed indicator of the convergence of the elemental integration, which is defined as the (non-integer, hypothetical real-valued) number of integration points at which the Error in Eq. (4.3.2) is exactly equal to the threshold, as given by the following equations:

$$\text{Error}\left(n_r^{opt}\right) = \text{threshold} \tag{4.8.2}$$

$$\text{Error}\left(n^{opt} - 1\right) \geq \text{Error}\left(n_r^{opt}\right) = \text{threshold} \geq \text{Error}\left(n^{opt}\right) \tag{4.8.3}$$

$$n^{opt} - 1 \leq n_r^{opt} \leq n^{opt} \tag{4.8.4}$$

Since the value of Error is almost linear on a semi-logarithmic graph as shown in Fig. 4.6, $n_r^{opt}$ is determined using interpolation as

$$n_r^{opt} = \left(n^{opt} - 1\right) + \frac{\log(\text{threshold}) - \log \text{Error}\left(n^{opt} - 1\right)}{\log \text{Error}\left(n^{opt}\right) - \log \text{Error}\left(n^{opt} - 1\right)} \tag{4.8.5}$$
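The interpolation of Eq. (4.8.5) is straightforward to implement, for example as follows, where err_prev = Error(n^opt − 1) and err_n = Error(n^opt) are the two bracketing error values.

    import numpy as np

    def n_r_opt(n_opt, err_prev, err_n, threshold):
        # Eq. (4.8.5): log-linear interpolation between Error(n_opt - 1) and Error(n_opt)
        return (n_opt - 1) + (np.log(threshold) - np.log(err_prev)) / (np.log(err_n) - np.log(err_prev))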

The relationship between each shape feature and the optimal number of integration points (real value) $n_r^{opt}$ is shown in Figs. 4.25, 4.26, 4.27, 4.28, 4.29, 4.30 and 4.31. Figure 4.25 shows the relationship between the shape feature AlgebraicShapeMetric and the optimal number of integration points (real value) $n_r^{opt}$, where the horizontal axis is the AlgebraicShapeMetric value and the vertical axis is $n_r^{opt}$, displayed for 5000 randomly selected elements. It is clear from the figure that the smaller the AlgebraicShapeMetric value is, the larger the value of $n_r^{opt}$ tends to be.
Figures 4.26 and 4.27 show the relationship between MinEdgeLength and $n_r^{opt}$, and that between MaxEdgeLength and $n_r^{opt}$, respectively. It can be seen from the figures that the degree of correlation is relatively small.
Figures 4.28 and 4.29 show the relationship between MinEdgeAngle and $n_r^{opt}$, and that between MaxEdgeAngle and $n_r^{opt}$, respectively, while Figs. 4.30 and 4.31 depict that between MinFaceAngle and $n_r^{opt}$, and that between MaxFaceAngle and $n_r^{opt}$, respectively. It can be seen from the figures that the degree of correlation is strong.
As described above, all of the shape features are correlated with the convergence of the elemental integration; then it seems reasonable to construct a neural network to estimate the optimal number of integration points n^opt using these shape features as input.
Fig. 4.25 Relation between AlgebraicShapeMetric and optimal number of quadrature points

Fig. 4.26 Relation between MinEdgeLength and optimal number of quadrature points

Fig. 4.27 Relation between MaxEdgeLength and optimal number of quadrature points

Fig. 4.28 Relation between MinEdgeAngle and optimal number of quadrature points

Fig. 4.29 Relation between MaxEdgeAngle and optimal number of quadrature points

Fig. 4.30 Relation between MinFaceAngle and optimal number of quadrature points

Fig. 4.31 Relation between MaxFaceAngle and optimal number of quadrature points

4.8.2 Training Phase

Here, a feedforward neural network that estimates the optimal number of integration points n^opt from element shape parameters is constructed. The input data and output data (teacher data) of the neural network are, respectively, set as follows:
Input data: Seven shape features of an element.
Teacher data: Optimal number of numerical integration points of the element, n^opt.
The number of units in the output layer is set to five, and one-hot encoding is
again used as the teacher data. Here, the same number of outputs as the number of
categories (five in this case) is used, and only the unit corresponding to the correct
category outputs 1, while the other units output 0.
The number of hidden layers and that of units per hidden layer are determined by
referring to the structure finally adopted in Sect. 4.7 as follows:
The number of hidden layers: 5
The number of units per hidden layer: 40
The other conditions for the training are set as the same as those in Sect. 4.7.

4.8.3 Application Phase

We show here the results of the estimation of the optimal number of integration points
for the generalization capability verification patterns (10,000 patterns) in Table 4.5.
Since one-hot encoding is used as the output method, the category corresponding to
the output unit that outputs the maximum value for each input pattern is determined
to be the predicted category (the optimal number of integration points).

Table 4.5 Optimal number of quadrature points estimated by deep learning

            n^DL = 4 | n^DL = 5 | n^DL = 6 | n^DL = 7 | n^DL = 8 | Total
n^opt = 4          3 |       44 |        0 |        0 |        0 |    47
n^opt = 5         17 |     4819 |      999 |        1 |        0 |  5836
n^opt = 6          0 |     1199 |     2498 |       97 |        1 |  3795
n^opt = 7          0 |        2 |      165 |      119 |       12 |   298
n^opt = 8          0 |        0 |        3 |        9 |       12 |    24

It can be seen from Table 4.5 that the level of correct classification is more than 74% and that most of the misclassifications are into adjacent categories.
As shown above, it is also possible to estimate the optimal number of integration
points for an element from the shape features of the element, but the accuracy is
inferior to the estimation using the nodal coordinate values. However, it may be possible to achieve higher accuracy by a better selection of the set of shape features and by using the nodal coordinate values in combination with the shape features.

References

1. Bathe, K. J.: Finite Element Procedures. Prentice-Hall (1996)


2. Hughes, T. J. R.: The Finite Element Method : Linear Static and Dynamic Finite Element
Analysis. Dover (2000)
3. Kikuchi, M.: Application of the symbolic mathematics system to the finite element program.
Comput. Mech. 5, 41–47 (1989)
4. Yagawa, G., Ye, G. -W., Yoshimura, S.: A numerical integration scheme for finite element
method based on symbolic manipulation. Int. J. Numer. Methods Eng. 29, 1539–1549 (1990)
5. Ait-Haddou, R., Barton, M., Calo, V. M.: Explicit Gaussian quadrature rules for C1 cubic
splines with symmetrically stretched knot sequences. J. Comput. Appl. Math. 290, 543–552
(2015)
6. Barton, M., Calo, V. M.: Optimal quadrature rules for odd-degree spline spaces and their
application to tensor-product-based isogeometric analysis. Comput. Methods Appl.
Eng. 305, 217–240 (2016)
7. Bittencourt, M. L., Vazquez, T. G.: Tensor-based Gauss-Jacobi numerical integration for
high-order mass and stiffness matrices. Int. J. Numer. Methods Eng. 79, 599–638 (2009)
8. Hansbo, P.: A new approach to quadrature for finite elements incorporating hourglass control
as a special case. Comput. Methods Appl. Mech. Eng. 158, 301–309 (1998)
9. Hughes, T. J. R., Cottrell, J. A., Bazilevs, Y.: Isogeometric Analysis: CAD, finite elements,
NURBS, exact geometry, and mesh refinement. Comput. Methods Appl. Mech. Eng. 194,
4135–4195 (2005)
10. Johannessen, K. A.: Optimal quadrature for univariate and tensor product splines. Comput.
Methods Appl. Mech. Eng. 316, 84–99 (2017)
11. Liu, W. K., Guo, Y., Tang, S., Belytschko, T.: A multiple-quadrature eight-node hexahedral
finite element for large deformation elastoplastic analysis. Comput. Methods Appl. Mech. Eng.
154, 69–132 (1998)
12. Mousavi, S. E., Xiao, H., Sukumar, N.: Generalized Gaussian quadrature rules on arbitrary
polygons. Int. J. Numer. Methods Eng. 82, 99–113 (2010)
138 4 Numerical Quadrature with Deep Learning

13. Nagy, A. P., Benson, D. J.: On the numerical integration of trimmed isogeometric elements.
Comput. Methods Appl. Mech. Eng. 284, 165–185 (2015)
14. Rajendran, S.: A technique to develop mesh-distortion immune finite elements. Comput.
Methods Appl. Mech. Eng. 199, 1044–1063 (2010)
15. Schillinger, D., Hossain, S. J., Hughes, T. J. R.: Reduced Bezier element quadrature rules for
quadratic and cubic splines in isogeometric analysis. Comput. Methods Appl. Mech. Eng. 277,
1–45 (2014)
16. Sevilla, R., Fernandez-Mendez, S.: Numerical integration over 2D NURBS-shaped domains
with applications to NURBS-enhanced FEM. Finite Elem. Anal. Des. 47, 1209–1220 (2011)
17. Oishi, A., Yagawa, G.: Computational mechanics enhanced by deep learning. Comput. Methods
Appl. Mech. Eng. 327, 327–351 (2017)
18. Goldberg, D. E.: Genetic Algorithms in Search, Optimization & Machine Learning. Addison-
Wesley (1989)
19. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. Springer-
Verlag (1992)
20. Karaboga, D., Basturk, B.: A powerful and efficient algorithm for numerical function
optimization: artificial bee colony (ABC) algorithm. J. Global Optim. 39(3), 459–471 (2007)
21. Kennedy, J., Eberhart, R.: Particle Swarm Optimization. In: Proceedings of IEEE International
Conference on Neural Networks, IV, pp. 1942–1948 (1995)
22. Yang, X. S.: Nature-Inspired Metaheuristic Algorithms. Luniver Press, Frome, UK (2008)
23. Yang, X. S.: A new metaheuristic bat-inspired algorithm. Studies in Computational Intelligence
284, 65–74, Springer (2010)
24. Bonabeau, E., Dorigo, M., Theraulaz, G.: Swarm Intelligence: From Natural to Artificial
Systems. Oxford University Press (1999)
25. Botello, S., Marroquin, J. L., Oñate, E., Van Horebeek, J.: Solving structural optimization
problems with genetic algorithms and simulated annealing. Int. J. Numer. Methods Eng. 45(5),
1069–1084 (1999)
26. Parpinelli, R. S., Teodoro, F. R., Lopes, H. S.: A comparison of swarm intelligence algorithms
for structural engineering optimization. Int. J. Numer. Methods Eng. 91, 666–684 (2012)
27. Vieira, I. N., Pires de Lima, B. S. L., Jacob, B. P.: Bio-inspired algorithms for the optimization
of offshore oil production systems. Int. J. Numer. Methods Eng. 91, 1023–1044 (2012)
28. Knupp, P. M.: A method for hexahedral mesh shape optimization. Int. J. Numer. Methods Eng.
58, 319–332 (2003)
Chapter 5
Improvement of Finite Element Solutions with Deep Learning

Abstract The accuracy of the FEM solution is known to be improved when dividing
the analysis domain into smaller elements, while the computation time increases
explosively. In this chapter, we discuss a method to improve the accuracy of the
FEM solution with a small number of elements using error information and deep
learning.

5.1 Accuracy Versus Element Size

In the finite element method (FEM), the static problem of a structure is reduced to the following simultaneous linear equations:

$$[K]\left\{U^G\right\} = \{F\} \tag{5.1.1}$$

{ }
where U G is the displacement vector of all the nodes { in }the domain, [K ] the global
stiffness matrix, and {F} the load vector. The size of U G is the product of the total
number of nodes and the degrees of freedom per node. The global stiffness matrix
[K ] is constructed by assembling all the element stiffness matrices in the analysis
domain as shown in

Σ
ne
[ e]
[K ] = k (5.1.2)
e=1

where $[k^e]$ is the element stiffness matrix of the e-th element and $n_e$ the total number of elements in the domain.
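The assembly of Eq. (5.1.2) can be sketched as follows for a dense global matrix; the connectivity format and the dense storage are simplifying assumptions for illustration (a practical code would use the sparse storage discussed in Sect. 5.2).

    import numpy as np

    def assemble_global_stiffness(n_nodes, connectivity, element_matrices, ndof=2):
        # Eq. (5.1.2): add every element stiffness matrix [k^e] into the global [K];
        # connectivity[e] lists the global node numbers of element e
        K = np.zeros((n_nodes * ndof, n_nodes * ndof))
        for conn, ke in zip(connectivity, element_matrices):
            dofs = np.array([n * ndof + d for n in conn for d in range(ndof)])
            K[np.ix_(dofs, dofs)] += ke
        return K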
The displacements of each node are obtained by solving Eq. (5.1.1), from which
the displacements, strains, and stresses at an arbitrary location in the analysis domain
are calculated. If the shape function of an element is fixed, the accuracy of the
calculated physical quantities (such as displacements, strains, and stresses) depends

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 139
G. Yagawa and A. Oishi, Computational Mechanics with Deep Learning,
Lecture Notes on Numerical Methods in Engineering and Sciences,
https://doi.org/10.1007/978-3-031-11847-0_5
140 5 Improvement of Finite Element Solutions with Deep Learning

Fig. 5.1 Two-dimensional


stress analysis
(4,4)

y
x (0,0)

on the element size, meaning that a lot of small elements should be used to obtain a
high accuracy.
A simple example of the domain of a two-dimensional stress analysis is shown
in Fig. 5.1, where the bottom surface is fixed and the load is applied to the half of
the top surface, the elements are four-noded linear elements, and the stress anal-
ysis is performed using seven different element divisions with different numbers of
elements: 16 (4 × 4), 64 (8 × 8), 256 (16 × 16), 1024 (32 × 32), 4096 (64 × 64),
16,384 (128 × 128), and 65,536 (256 × 256). Figure 5.2 shows a typical example of
step-by-step element division, where all the elements are divided equally into four at
each step and the material is assumed to be an isotropic elastic one. Figure 5.3 shows
where the stress values are evaluated. Figures 5.4 and 5.5 depict the stress values at
the point A and the point B in Fig. 5.3, respectively, where the horizontal axes are the
total numbers of elements in the meshes and the vertical axes the calculated stress
values. It can be seen from the figures that as the numbers of elements increase or
the element sizes decrease, there exists a tendency to converge to a certain value.
In other words, the accuracy of analysis results can be improved by dividing the
analysis domain into as many elements as possible.
There have been some theoretical studies on the accuracy of the finite element
method [4]. For example, for a one-dimensional problem spanning the interval [a, b],
the accuracy of the finite element solution is evaluated as follows [17]:

$$\|u - u_h\|_m = \sqrt{\int_a^b \sum_{i=0}^{m} \left(\frac{d^i u}{dx^i} - \frac{d^i u_h}{dx^i}\right)^2 dx} \leq c\, h^{k+1-m} \tag{5.1.3}$$

where u is the exact solution, h the element size, $u_h$ the finite element solution, c a
constant, k the degree of the basis function (polynomial), and 2m the order of the
differential equation to be solved. Note that, in the case of the ordinary linear stress
analysis, m = 1. It can be seen from the above equation that the smaller the element
size and the higher the order of the basis functions, the closer the finite element
solution is to the exact one.

5.2 Computation Time versus Element Size

As discussed in the previous section, the accuracy of the FEM solution is improved by
reducing the element size. In this section, the increase in the amount of computation
with reducing the element size is studied.
As the element size is reduced, the total number of elements as well as the compu-
tation time increases. From the viewpoint of computational load, the main processes
of the finite element method consist of
• Construction process of the global stiffness matrix [K ].
• Solving process of the set of linear equations with [K ] as the coefficient matrix
(Eq. (5.1.1)).
As for the former, the computational load required is proportional to the total
number of elements in the domain. Consider a two-dimensional rectangular region
is divided evenly by quadrilateral elements as shown in Fig. 5.2. If the length of
one side of the element is halved, the total number of elements together with the
computational load required to construct the global stiffness matrix is quadrupled. If
a hexahedron in the three-dimensional case is divided evenly into smaller hexahedral
elements, halving the length of one side of the element increases the total number
of elements by a factor of eight, and the computational load also does by a factor of
eight.
On the other hand, the computation time required to solve a set of linear equations
with [K ] as the coefficient matrix (Eq. (5.1.1)) also increases as the number of
elements does. Among the two methods to solve a set of linear equations, the direct
method and the iterative method, the number of unknowns is reduced sequentially
to obtain the solution in the former, while in the latter, the arbitrary initial solution
vector is successively updated to get closer to the correct solution vector [9].
Fig. 5.2 Mesh division (16, 64, and 256 elements)

Fig. 5.3 Points A (1.95, 3.95) and B (1.95, 1.95) selected for stress evaluation

Fig. 5.4 Stresses versus number of elements (point A)

Fig. 5.5 Stresses versus number of elements (point B)

Let us consider the amount of computation required to solve simultaneous linear equations as follows:

$$[A]\{x\} = \{b\} \tag{5.2.1}$$

The Gaussian elimination method is known to be the most basic direct method
for solving a set of linear equations above, a pseudo-code for which is given as List
5.2.1.
List 5.2.1 Pseudo-code for Gaussian elimination
1  for(i=1;i<=n-1;i++){            /* forward elimination */
2    for(j=i+1;j<=n;j++){
3      aa = A[j][i]/A[i][i];
4      b[j] = b[j] - aa*b[i];
5      for(k=i+1;k<=n;k++){
6        A[j][k] = A[j][k] - aa*A[i][k];
7      }
8    }
9  }
10 b[n] = b[n]/A[n][n];            /* backward substitution */
11 for(i=n-1;i>=1;i--){
12   for(j=i+1;j<=n;j++){
13     b[i] = b[i] - A[i][j]*b[j];
14   }
15   b[i] = b[i]/A[i][i];
16 }

In the code, n is the number of unknowns, A[][] is a two-dimensional array representing the coefficient matrix, b[] is a one-dimensional array representing the right-hand side vector, and the solution is stored in the one-dimensional array b[].
The first through ninth lines are called the forward elimination process and the
tenth through sixteenth lines the backward substitution process. Based on the pseudo-
code above, the amount of computation required to solve the simultaneous linear
equations can be estimated. The most frequent operation in the forward elimination
process is that in the sixth line, the innermost of the triply nested loop. Since two
arithmetic operations are performed in the sixth line, and the loop is triply nested,
the total amount of arithmetic operations caused by the line is estimated as follows:

$$\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\sum_{k=i+1}^{n} 2 = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} 2(n-i) = \sum_{i=1}^{n-1} 2(n-i)^2 = \frac{1}{3}(n-1)n(2n-1) \tag{5.2.2}$$

Similarly, in the backward substitution process, the number of arithmetic operations at the 13th line is the highest, which is estimated as

$$\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} 2 = \sum_{i=1}^{n-1} 2(n-i) = (n-1)n \tag{5.2.3}$$

As shown above, in the Gaussian elimination method, the computational load of the forward elimination process increases in proportion to the cube of the number of unknowns, while that of the backward substitution process increases in proportion to the square of the number of unknowns, meaning that the total computational time required for the entire solution process increases in proportion to the cube of the number of unknowns. Note here that if a computing process consists of multiple subprocesses and each of them requires a computational load in proportion to a different power of the number of unknowns, the computational load of the subprocess with the highest power becomes dominant as the number of unknowns increases.
Next, the computational load of the Gauss–Seidel method is discussed, which is one of the basic iterative solution methods, where the coefficient matrix [A] is decomposed into two matrices as shown in the following equation.

$$[A] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} - \begin{bmatrix} 0 & -a_{12} & \cdots & -a_{1n} \\ 0 & 0 & \cdots & -a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} = [N] - [P] \tag{5.2.4}$$

Substituting Eq. (5.2.4) into Eq. (5.2.1), we have

$$[N]\{x\} = [P]\{x\} + \{b\} \tag{5.2.5}$$

From this equation, a recurrence formula is obtained as follows:

$$[N]\{x\}^{(r+1)} = [P]\{x\}^{(r)} + \{b\} \tag{5.2.6}$$

In the Gauss–Seidel method, we start with an appropriate initial vector $\{x\}^{(0)}$, then improve it successively by the above equation to converge to the correct solution vector.
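A minimal Python sketch of this iteration is given below; it updates the components in place, which is equivalent to solving Eq. (5.2.6) with the lower triangular matrix [N] by substitution, and the iteration count and tolerance are placeholders.

    import numpy as np

    def gauss_seidel(A, b, n_iter=1000, tol=1e-10):
        n = len(b)
        x = np.zeros(n)                  # arbitrary initial vector {x}^(0)
        for _ in range(n_iter):
            x_old = x.copy()
            for i in range(n):
                # use already-updated values for j < i and old values for j > i
                s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x_old[i + 1:]
                x[i] = (b[i] - s) / A[i, i]
            if np.linalg.norm(x - x_old) < tol * np.linalg.norm(b):
                return x
        return x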
Let us consider the computational load per iteration of the equation. It is noted,
here, that the computational load of the matrix–vector product on the right-hand
side of the equation is proportional to the square of the number of unknowns. Since
the solution of the simultaneous linear equations with the lower triangular matrix [N] as the coefficient matrix is a substitution process analogous to the backward substitution in the Gaussian elimination method,
the computational load is also proportional to the square of the number of unknowns.
Therefore, the computational load per iteration of the Gauss–Seidel method increases
in proportion to the square of the number of unknowns, and if the number of iterations
is of the same order as the number of unknowns, the overall computational load of
the method is proportional to the cube of the number of unknowns.
Thus, the computational load of each process of the finite element method is given as
• Construction process of the global stiffness matrix [K]: O(n), and
• Solving process of the set of linear equations with [K] as the coefficient matrix (Eq. (5.1.1)): O(n^3),
where it is assumed that the number of nodes increases in proportion to that of elements n. Note that for any function f(n) of n, e.g., f(n) = n and f(n) = n^3 in the above, the notation O(f(n)) is used to stand for a computational load whose amount is upper-bounded by f(n) multiplied by an arbitrary positive constant [13].
As discussed earlier, if a computing process consists of two subprocesses, one of O(n) and the other of O(n^3), then the subprocess of O(n^3) becomes dominant for large values of n, suggesting that the total computational load of the process is regarded as O(n^3).
Consider a two-dimensional rectangular region divided into quadrilateral elements. If the length of one side of an element is halved, the total number of elements increases by a factor of four, and the total number of nodes increases by the same factor, so the computational complexity of the solution process of the resulting simultaneous linear equations increases by a factor of 64 (= 4^3).
In the three-dimensional analysis, if the length of one side of an element is halved, the total number of nodes increases by a factor of about 8, and the computational complexity increases dramatically by a factor of about 512 (= 8^3).
Thus, the computational complexity required to solve the set of linear equations
increases almost in proportion to the cube of the number of elements. In other words,
as the scale of analysis (number of elements) increases, the computational complexity
required to solve the set of linear equations becomes more serious than that required
for the global stiffness matrix construction process.
Note that the estimation above is based on the assumption that the coefficient
matrix is dense in which most of the components are nonzero. In practice, the global
stiffness matrix, which is the coefficient matrix of the simultaneous linear equations
in the finite element method, is usually sparse in which most of the components are
zero, suggesting that the amount of computation required for solution could be much
less.
Consider a two-dimensional stress analysis using a mesh of nine linear quadri-
lateral elements and 16 nodes as shown in Fig. 5.6a, then the global stiffness matrix
is obtained as shown in Fig. 5.6b. If the displacements at the ith node are{denoted }
as (Ui , Vi ), the global stiffness matrix is of 32 rows and 32 columns and U G in
Eq. (5.1.1) is written as follows:

⎛ ⎞
U1
⎜ ⎟
⎜ ⎟ V1
⎜ ⎟
⎜ ⎟ U2
{ G} ⎜ ⎟
U =⎜⎜ ⎟ V2 (5.2.7)
⎟ ..
⎜ ⎟
⎜ ⎟ .
⎜ ⎟
⎝ U16 ⎠
V16
Fig. 5.6 Sparse global stiffness matrix obtained in the finite element method: (a) mesh of nine quadrilateral elements with 16 nodes, (b) nonzero pattern of the corresponding global stiffness matrix

The global stiffness matrix is usually symmetric, and nonzero components are
indicated by blue circles only in the upper triangular part above the diagonal of the
matrix as shown in Fig. 5.6b.
As shown in Fig. 5.7, the global stiffness matrix is usually sparse, where the
nonzero components are located in a banded region near the diagonal. The width B
of this band shown in the figure is called the maximum half-bandwidth.
For a sparse matrix of a banded structure such as the global stiffness matrix shown
above, the scope of the for-loop in the Gaussian elimination method (List 5.2.1) can be narrower, and the computational load in this case can be reduced to about O(nB^2). Although the maximum half-bandwidth B increases with the number of unknowns n, the computational load can be greatly reduced since B ≪ n for most cases.
It is known that the nodal numbering of the FEM is arbitrary, and the maximum
half-bandwidth B of the global stiffness matrix depends on the nodal numbering [5,
11]. For example, Fig. 5.8a shows the same mesh as in Fig. 5.6a, but with different
node numbering, and Fig. 5.8b shows the global stiffness matrix for the mesh given
in Fig. 5.8a. It is noted that the maximum half-bandwidth of the matrix in Fig. 5.8b
is slightly larger than that in Fig. 5.6b. Since the maximum half-bandwidth varies
depending on the nodal numbering, affecting the computational load required for
solving corresponding simultaneous linear equations, the optimal nodal numbering
methods have been studied including the Cuthill–McKee (CM) method [7], the
reverse Cuthill–McKee (RCM) method [6, 14], and other methods [8, 12, 18].
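As a concrete illustration, the following minimal Python sketch (our own, not taken from the book) computes the maximum half-bandwidth of a mesh directly from its element connectivity for a given node numbering, assuming one unknown per node:

def max_half_bandwidth(elements):
    """elements: list of tuples of (0-based) node numbers of each element."""
    B = 0
    for conn in elements:
        # within one element, all node pairs couple, so the band must cover
        # the largest spread of node numbers appearing in any element
        B = max(B, max(conn) - min(conn) + 1)
    return B

# 3 x 3 mesh of four-node quadrilaterals with row-wise node numbering
# (the numbering of Fig. 5.6a, shifted to start at 0)
mesh_a = [(i + 4 * j, i + 1 + 4 * j, i + 5 + 4 * j, i + 4 + 4 * j)
          for j in range(3) for i in range(3)]
print(max_half_bandwidth(mesh_a))   # 6 for this numbering

Renumbering the nodes (e.g., by the CM or RCM orderings mentioned above) changes only the connectivity tuples passed to this function, which is how such reorderings reduce B without altering the mesh itself.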
It has been shown above that the computational load can be reduced by exploiting the banded structure of the global stiffness matrix, whose nonzero components appear only in a band near the diagonal.
Fig. 5.7 Banded matrix with maximum half-bandwidth B
Fig. 5.8 Another banded matrix with different node numbering

In addition, the memory usage can be reduced as well by storing only the compo-
nents in the band. As shown in Fig. 5.7, not only nonzero components but also zero
components exist in the banded region. Since the zero component does not affect
the results, further reduction of memory and computation time can be achieved by
storing only the nonzero components in memory. In this case, the reduction rate
increases with the size of problem or the number of nodes in the domain, meaning
that this storage method is particularly effective in large-scale analysis.
Various compact storage methods are available for sparse matrices including CRS
(Compressed Row Storage), CCS (Compressed Column Storage), and JDS (Jagged
Diagonal Storage) [19]. They store the nonzero components in a one-dimensional
real array and the position information (row, column) of each nonzero component in
a few integer arrays.
Although such a storage method can dramatically reduce the amount of memory
for large-scale analysis and the computation load as well, it should be noted that a
degradation in efficiency may occur due to non-contiguous memory access during
computation.
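The following minimal Python sketch (our own illustration; the array and function names are not those of any particular library) shows the CRS idea: the nonzero values go into one real array, their column indices and the row start positions into two integer arrays, and a matrix–vector product then touches only the stored entries:

import numpy as np

def to_crs(A, tol=0.0):
    # values: nonzero entries, col_idx: their columns, row_ptr: row start offsets
    values, col_idx, row_ptr = [], [], [0]
    for row in A:
        for j, a in enumerate(row):
            if abs(a) > tol:
                values.append(a)
                col_idx.append(j)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx, dtype=int), np.array(row_ptr, dtype=int)

def crs_matvec(values, col_idx, row_ptr, x):
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

A = np.array([[4.0, 1.0, 0.0, 0.0],
              [1.0, 4.0, 1.0, 0.0],
              [0.0, 1.0, 4.0, 1.0],
              [0.0, 0.0, 1.0, 4.0]])
v, c, r = to_crs(A)
print(crs_matvec(v, c, r, np.ones(4)))   # identical to A @ np.ones(4)

The inner loop of crs_matvec jumps through memory according to col_idx, which is precisely the non-contiguous access pattern mentioned above.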

5.3 Error Estimation of Finite Element Solutions

In this section, two methods for estimating the error in the solution obtained by the
FEM are studied.

5.3.1 Error Estimation Based on Smoothing of Stresses

Various methods have been used to estimate the error from the correct or exact solu-
tion for the results of the FEM, which are known as the a posteriori error estimation
methods [1, 10, 20, 21].
In the FEM for solid mechanics, the displacement method is usually used, where
the displacements are unknown variables to be solved. In this method, the displace-
ments are continuous at the boundaries between elements, but the strains and stresses,
which are the first-order derivatives of the displacements, are discontinuous. From
the physical point of view, this is unacceptable and this inconvenience is considered
to be caused by the insufficient continuity of the basis functions.
In this regard, let us review the continuity of functions. Here, we introduce the
differentiability class C n , a measure of the continuity (smoothness) of a function,
meaning that a function belonging to C n is continuous up to the n-th derivative.
Let us consider, respectively, the continuities of the functions as
f_1(x) = \begin{cases} -x & (x < 0) \\ x & (x \ge 0) \end{cases}    (5.3.1)

f_2(x) = \begin{cases} -\frac{1}{2}x^2 & (x < 0) \\ \frac{1}{2}x^2 & (x \ge 0) \end{cases}    (5.3.2)

f_3(x) = \begin{cases} -\frac{1}{6}x^3 & (x < 0) \\ \frac{1}{6}x^3 & (x \ge 0) \end{cases}    (5.3.3)

Figure 5.9 depicts the function f_1(x) and its first-order derivative, Fig. 5.10 the function f_2(x) and its first- and second-order derivatives, and Fig. 5.11 the function f_3(x) and its first-, second-, and third-order derivatives. It is clear from these graphs that f_1(x) belongs to C^0, f_2(x) to C^1, and f_3(x) to C^2 at x = 0, respectively. Note that for x ≠ 0, all three functions belong to C^∞.
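These continuity classes can also be verified numerically. The short Python sketch below (our own illustration, not part of the book) compares the left and right one-sided difference quotients of each function at x = 0; repeating the test on the first derivatives would likewise distinguish C^1 from C^2:

def f1(x): return -x if x < 0 else x
def f2(x): return -0.5 * x**2 if x < 0 else 0.5 * x**2
def f3(x): return -x**3 / 6.0 if x < 0 else x**3 / 6.0

def one_sided_slope(f, x, h):
    # h > 0 gives the right-hand quotient, h < 0 the left-hand one
    return (f(x + h) - f(x)) / h

h = 1.0e-6
for name, f in (("f1", f1), ("f2", f2), ("f3", f3)):
    print(name, "f'(0+) =", one_sided_slope(f, 0.0, h),
          "f'(0-) =", one_sided_slope(f, 0.0, -h))
# f1: the one-sided slopes differ (+1 vs -1), so only C^0 at x = 0
# f2, f3: the one-sided slopes agree (both ~0), consistent with C^1 or higher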
Since the basis functions, defined on an element-by-element basis, usually have only C^0 continuity at element boundaries, the stresses, which are first-order derivatives of the basis functions, are discontinuous at element boundaries. Figure 5.12 shows the schematic diagrams in one dimension.

Fig. 5.9 f_1(x), belonging to C^0 at x = 0
Fig. 5.10 f_2(x), belonging to C^1 at x = 0
Fig. 5.11 f_3(x), belonging to C^2 at x = 0
Fig. 5.12 Schematic view of stress distributions across elements e_{i-1}, e_i, e_{i+1} for linear and quadratic elements: u (continuous), σ (discontinuous)
Let the discontinuous stresses above be smoothed so that they become continuous at the boundaries between elements, as the displacements are. For this purpose, we consider the stresses at an arbitrary position P(ξ, η) of a four-node quadrilateral element e_0 as shown in Fig. 5.13, which are given by
Fig. 5.13 Smoothing of stresses

\{\sigma(\xi,\eta)\} = \begin{pmatrix} \sigma_x(\xi,\eta) \\ \sigma_y(\xi,\eta) \\ \tau_{xy}(\xi,\eta) \end{pmatrix} = [D][L][N(\xi,\eta)]\{U\}    (5.3.4)

where
\{U\} = \begin{pmatrix} U_1 \\ V_1 \\ \vdots \\ U_4 \\ V_4 \end{pmatrix},    (5.3.5)

[N(\xi,\eta)] = \begin{bmatrix} N_1(\xi,\eta) & 0 & \cdots & N_4(\xi,\eta) & 0 \\ 0 & N_1(\xi,\eta) & \cdots & 0 & N_4(\xi,\eta) \end{bmatrix},    (5.3.6)

[L] = \begin{bmatrix} \frac{\partial}{\partial x} & 0 \\ 0 & \frac{\partial}{\partial y} \\ \frac{\partial}{\partial y} & \frac{\partial}{\partial x} \end{bmatrix},    (5.3.7)

Here, (Ui , Vi ) is the displacement at the i-th node and Ni (ξ, η) is the ith basis function.
[D] is the stress–strain matrix, which is given as follows [17]:
[D] = \frac{E}{1-\nu^2} \begin{bmatrix} 1 & \nu & 0 \\ \nu & 1 & 0 \\ 0 & 0 & \frac{1-\nu}{2} \end{bmatrix}  (Plane Stress)    (5.3.8)

[D] = \frac{E(1-\nu)}{(1+\nu)(1-2\nu)} \begin{bmatrix} 1 & \frac{\nu}{1-\nu} & 0 \\ \frac{\nu}{1-\nu} & 1 & 0 \\ 0 & 0 & \frac{1-2\nu}{2(1-\nu)} \end{bmatrix}  (Plane Strain)    (5.3.9)
where E is the Young's modulus and ν the Poisson's ratio.
The stresses \{\sigma_{P_1}^{e_0}\} at node P_1, one of the nodes of the element e_0, are obtained using Eq. (5.3.4) as follows:
\{\sigma_{P_1}^{e_0}\} = \{\sigma^{e_0}(-1,-1)\} = \{\sigma(-1,-1)\} = \begin{pmatrix} \sigma_x(-1,-1) \\ \sigma_y(-1,-1) \\ \tau_{xy}(-1,-1) \end{pmatrix}    (5.3.10)

The nodal point P_1 is shared by the elements e_4, e_6, and e_7, and the stresses \{\sigma_{P_1}^{e_4}\}, \{\sigma_{P_1}^{e_6}\}, and \{\sigma_{P_1}^{e_7}\} at the nodal point P_1 are, respectively, calculated for each element as

\{\sigma_{P_1}^{e_4}\} = \{\sigma^{e_4}(1,-1)\}, \quad \{\sigma_{P_1}^{e_6}\} = \{\sigma^{e_6}(1,1)\}, \quad \{\sigma_{P_1}^{e_7}\} = \{\sigma^{e_7}(-1,1)\}    (5.3.11)
By taking the average of these stresses, the smoothed stress \{\sigma_{P_1}^{S}\} at the node P_1 can be determined as follows:

\{\sigma_{P_1}^{S}\} = \frac{\{\sigma_{P_1}^{e_0}\} + \{\sigma_{P_1}^{e_4}\} + \{\sigma_{P_1}^{e_6}\} + \{\sigma_{P_1}^{e_7}\}}{4}    (5.3.12)
In the same manner, \{\sigma_{P_2}^{S}\}, \{\sigma_{P_3}^{S}\}, and \{\sigma_{P_4}^{S}\} are, respectively, obtained for the nodes P_2, P_3, and P_4, and then the smoothed stress at any position P(ξ, η) of the element e_0 is defined as

\{\sigma^{S}(\xi,\eta)\} = \sum_{i=1}^{4} N_i(\xi,\eta)\,\{\sigma_{P_i}^{S}\}    (5.3.13)

The smoothed stress is considered to be closer to the true one than the original discontinuous one, and an a posteriori error estimation method based on this (called the ZZ method) has been proposed [24]. The error of the FEM solution is usually defined as the difference between the true stresses \{\sigma^{TRUE}(\xi,\eta)\} and the stresses \{\sigma^{FEM}(\xi,\eta)\} obtained by the FEM analysis, while it is almost impossible to obtain the true stresses \{\sigma^{TRUE}(\xi,\eta)\}. On the other hand, as \{\sigma^{S}(\xi,\eta)\} is closer to the true stress than \{\sigma^{FEM}(\xi,\eta)\}, the difference between these stresses could represent the error.
The ZZ method is an a posteriori error estimation method based on this, which is
widely used as an error estimation method for the FEM because it is simple and
requires few modifications to analysis codes.
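A minimal Python sketch of this smoothing and of a ZZ-type indicator at a single node is given below (our own illustration; the array layout and function names are assumptions, cf. Eq. (5.3.12)):

import numpy as np

def smoothed_nodal_stress(elem_stress_at_node):
    # elem_stress_at_node: (n_adjacent_elements, 3) array of (sx, sy, txy)
    # evaluated at the shared node by each adjacent element
    return np.mean(elem_stress_at_node, axis=0)

def zz_indicator(elem_stress_at_node):
    # norm of (smoothed - element) stress, one value per adjacent element
    s_smooth = smoothed_nodal_stress(elem_stress_at_node)
    return np.linalg.norm(elem_stress_at_node - s_smooth, axis=1)

# four elements sharing node P1, each reporting slightly different stresses
stresses = np.array([[1.00, 0.10, 0.02],
                     [1.10, 0.12, 0.01],
                     [0.95, 0.08, 0.03],
                     [1.05, 0.11, 0.02]])
print(smoothed_nodal_stress(stresses))   # smoothed stress at the node (Eq. (5.3.12))
print(zz_indicator(stresses))            # error indicator per adjacent element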
5.3.2 Error Estimation Using Solutions Obtained by Various Meshes

As described in Sect. 5.1, the accuracy of the FEM solution is usually improved
by decreasing the element size. This is because the approximation accuracy of the
solution by the basis functions is improved, and the same effect can be obtained by
increasing the order of the basis functions. We discuss here the behavior of error in
the FEM solution when reducing the element size.
In the finite difference method (see Sect. 7.2), a method for improving the solution
based on the relationship between the grid spacing and the accuracy of the solution
is studied [23]. If the approximation accuracy of the finite difference method with lattice spacing Δx is O(Δx^2), and the solutions with two different lattice spacings Δx_1 and Δx_2 (Δx_1 > Δx_2) are \phi_l^{\Delta x_1} and \phi_l^{\Delta x_2}, respectively, then, with \phi_l^{TRUE} being the true solution, we can write the following relationships.

\phi_l^{TRUE} - \phi_l^{\Delta x_1} \propto (\Delta x_1)^2    (5.3.14)

\phi_l^{TRUE} - \phi_l^{\Delta x_2} \propto (\Delta x_2)^2    (5.3.15)

It is known that, although not strictly correct, an estimate of \phi_l^{TRUE} even more accurate than \phi_l^{\Delta x_2} is obtained by assuming the following equality.

\frac{\phi_l^{TRUE} - \phi_l^{\Delta x_1}}{\phi_l^{TRUE} - \phi_l^{\Delta x_2}} = \frac{(\Delta x_1)^2}{(\Delta x_2)^2}    (5.3.16)

This method is known as the Richardson extrapolation.
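A minimal Python sketch of this extrapolation, obtained by solving Eq. (5.3.16) for the improved value under the assumption of a second-order method, is given below (the function names are ours):

def richardson(phi1, dx1, phi2, dx2):
    # solve Eq. (5.3.16) for phi_true: with r = (dx1/dx2)**2,
    # phi_true = (r * phi2 - phi1) / (r - 1)
    r = (dx1 / dx2) ** 2
    return (r * phi2 - phi1) / (r - 1.0)

# toy example: exact value 1.0, error exactly proportional to dx**2
exact = 1.0
phi_coarse = exact + 0.04 * 0.2**2      # dx1 = 0.2
phi_fine   = exact + 0.04 * 0.1**2      # dx2 = 0.1
print(richardson(phi_coarse, 0.2, phi_fine, 0.1))   # 1.0, better than phi_fine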


Developed also is a method to accurately predict the stress intensity factor at
the crack tip based on the relationship between the nodal density and the error [22],
where the true stress value is predicted by using the relation between the node density
and the stress obtained by the FEM as

\sigma^{FEM} = \sigma^{TRUE} + a N^{-\delta}    (5.3.17)

where σ FEM is the stress value obtained by the FEM, σ TRUE the true stress value, a
and δ are coefficients, and N is the nodal density. By estimating the coefficients from
several analyses with different nodal densities, it has been possible to estimate σ TRUE .
This method is successfully applied to two- and three-dimensional crack analyses.
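As an illustration, the Python sketch below fits Eq. (5.3.17) to a few analyses with different nodal densities using a general-purpose nonlinear least-squares routine (scipy.optimize.curve_fit); the synthetic data and the choice of fitting tool are our own assumptions, not those of [22]:

import numpy as np
from scipy.optimize import curve_fit

def model(N, sigma_true, a, delta):
    # Eq. (5.3.17): sigma_FEM = sigma_TRUE + a * N**(-delta)
    return sigma_true + a * N ** (-delta)

N_values = np.array([100.0, 400.0, 1600.0, 6400.0])    # nodal densities (toy values)
sigma_fem = 2.5 + 3.0 * N_values ** (-0.7)              # synthetic FEM stresses

params, _ = curve_fit(model, N_values, sigma_fem, p0=(2.0, 1.0, 1.0))
sigma_true_est, a_est, delta_est = params
print(sigma_true_est, a_est, delta_est)   # approximately 2.5, 3.0, 0.7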
The ZZ method described in the previous section is a method for estimating
the error of a finite element solution by taking the difference between the original
solution and the better solution obtained by smoothing. A better solution can also
be obtained by reducing the element size, so it is possible to estimate the error of
a finite element solution by taking the difference between an analysis with a given
mesh and an analysis with a finer mesh. The use of meshes with multiple levels of
fineness will provide more detailed error information, which can be used to improve
the solution [22].

5.4 Improvement of Finite Element Solutions Using Error Information and Deep Learning

In this section, we study the details of the method for improving the finite element
solutions using deep learning based on the error estimation (Sect. 5.3) [16].
In order to obtain an accurate solution with a small number of elements and nodes
by using a posteriori error estimation, a method called the adaptive finite element
method has been studied [2, 3, 15], which consists of three steps: an analysis with the initial mesh is performed, then a posteriori error estimation follows, and finally, any part of the analysis domain with a relatively large error is remeshed to improve the accuracy. This process is repeated until the error criterion is satisfied.
The adaptive finite element method is classified into three types: the h-adaptive
method, where the mesh is locally refined; the p-adaptive method, where the order of
the basis functions is locally increased; and the r-adaptive method, where the nodes
are locally relocated. Each method may be used alone or in combination.
The most commonly used method among these three types is the h-adaptive
method, where (1) analysis and a posteriori error estimation are performed on a mesh,
(2) the element subdivision is performed to refine the mesh for regions where the
error exceeds the criterion, and (1) and (2) are repeated until the error becomes suffi-
ciently small everywhere in the domain. Since the subdivision is locally performed,
the increase in the total number of nodes, which directly affects the analysis time,
may be suppressed. However, the repeated subdivision of mesh, even if only partially,
increases the total number of nodes, and the FEM analysis is repeated. Then, the total
computational load is not necessarily small.
In the adaptive FEM, the accuracy of the solution is improved by firstly refining
the mesh based on the error information and then performing analysis using the
refined mesh, where the error is employed to improve the solution not directly but
indirectly.
In contrast, here, a method that directly uses the error information to improve
the solution is presented, where “directly” means “without remeshing.” Specifically,
deep learning with a feedforward neural network is used to estimate the stresses
equivalent to those obtained with a sufficiently fine mesh directly from the stresses
obtained with a coarse mesh and its error information. For this purpose, the feedfor-
ward neural network is trained to output the nearly exact stresses at any point in the
analysis domain. The input data used are the stresses obtained with a coarse mesh
at the point of interest and its surrounding points as well as their error information.
This method is summarized in the following three phases.
Data Preparation Phase: The FEM analyses with a coarse mesh under the various
analysis conditions of analysis domains, load conditions, fixation conditions, etc.,
are performed, and then error information is also obtained by an a posteriori error
estimation method on each of the above results. In addition, for each analysis condi-
tion, the FEM analysis with a very fine mesh is performed to obtain a solution close
to the true solution. Finally, a large number of data pairs are collected: each of
data pairs consists of the solution with a coarse mesh, its error information, and the
corresponding solution with a fine mesh.
Training Phase: A feedforward neural network is constructed by deep learning
using the data pairs collected in the Data Preparation Phase above as training patterns,
where input and teacher data for the neural network are set as follows:
• Input data: solution with a coarse mesh and its error information.
• Teacher data: solution with a fine mesh.
Application Phase: The FEM solution and its error information with a coarse
mesh for a problem to be solved are input to the trained neural network constructed
in the Training Phase; then a corresponding accurate solution that would be obtained
with a fine mesh is output from the neural network.
It is noted that, in this method, the input data for a feedforward neural network
include only the stress state and error information in the vicinity of the location
at which accurate stress is to be estimated, not including the analysis geometry or
boundary conditions, which may make the trained neural network applicable easily
to various analysis conditions.
In addition, since the input data include only the values obtained with a coarse
mesh, and also the inference by the trained neural network is fast, this method makes
it possible to estimate the accurate values of stresses at a specific point much faster
than the conventional FEM analysis with a fine mesh.
It may be a demerit of the present method that we can estimate the stresses not for
the whole region but only for a target point, although the values are accurate. However,
in most cases, it is sufficient to obtain the stress values at a specific important point or
area in the analysis domain. Then, the present method is considered to be a powerful
tool in such situations as optimal design, where repeated analyses are required.
The present method is categorized into Method-A and Method-B depending on
the techniques to get error information [16].
Figure 5.14a, b show, respectively, the flowchart of the standard finite element
analysis and that of Method-A. The latter is a method to get some error information
from differences between stresses obtained by the finite element analysis with a
coarse mesh and smoothed stresses and then estimate accurate stresses by deep
learning from the analysis results and the error information obtained with the coarse
mesh above.
On the other hand, Fig. 5.15a, b, respectively, show the flowchart of the standard
adaptive finite element analysis and that of Method-B. The latter is a method that
gets some error information from the difference of two sets of stresses: the stresses
obtained from the analysis with the initial mesh in standard adaptive FEM analysis
and those with a refined mesh generated at the first step of the adaptive remeshing.
While the adaptive FEM analysis repeats both adaptive remeshings and analyses with
the refined meshes, Method-B does not have any loop. In other words, Method-B is
a method for obtaining an accurate solution based on two relatively coarse meshes,
an initial mesh and its refined mesh, using deep learning.

5.5 Numerical Example

In this section, we study a numerical example of the method using smoothing stress
(Method-A), which is one of the methods for improving the solution of FEM analysis
using some error information and deep learning given in Sect. 5.4.

5.5.1 Data Preparation Phase

Here, we test Method-A (see Sect. 5.4) about its basic performance in a two-
dimensional stress analysis using four-node quadrilateral elements [16], where a
feedforward neural network is trained using stresses at a target point obtained by the
FEM with a fine mesh as teacher data, and stresses and some error information at
the point and its surrounding points with a coarse mesh as input data. The neural
network is trained to output accurate stresses at the target point when stresses and
the error information around the point obtained with a coarse mesh are input.
As mentioned above, not only the stresses at the target point where highly accurate
stresses are to be predicted, but also the stresses and some error information in its
neighborhood are used as auxiliary information of input data. For this purpose, as
shown in Fig. 5.16, stress evaluation points are arranged as a grid around the target
point, where such auxiliary information as the error information is generated.
Among many options for the arrangement of the points around the target point
as shown in Fig. 5.16, we employ here that with four neighborhood points shown
in Fig. 5.16a, where the stresses σ_x^C, σ_y^C, and τ_xy^C and their smoothed counterparts σ_x^S, σ_y^S, and τ_xy^S from the FEM using a coarse mesh, and the stresses σ_x^F, σ_y^F, and τ_xy^F using a fine mesh, are obtained. From these stresses calculated with a coarse mesh at the target
point (PT ) and four points (PN 1 , PN 2 , PN 3 , PN 4 ) around the point, input data for the
neural network (a) and (b) are generated as follows:
Input data (a): Based on the difference between the stresses and the smoothed
stresses obtained with a coarse mesh at the point of interest and four points around
the point, total of 15 (3 stress components × 5 points) values are generated as shown
in Table 5.1, which are considered to represent the distribution of errors in the vicinity
of the target point.
Input data (b): Based on the difference between the stresses at the target point and
those at each of four points around the point, total of 12 (3 stress components × 4
Table 5.1 Input data: difference of stress values between calculated and smoothed ones

Point    σx                           σy                           τxy
P_T      P_T σ_x^S − P_T σ_x^C        P_T σ_y^S − P_T σ_y^C        P_T τ_xy^S − P_T τ_xy^C
P_N1     P_N1 σ_x^S − P_N1 σ_x^C      P_N1 σ_y^S − P_N1 σ_y^C      P_N1 τ_xy^S − P_N1 τ_xy^C
P_N2     P_N2 σ_x^S − P_N2 σ_x^C      P_N2 σ_y^S − P_N2 σ_y^C      P_N2 τ_xy^S − P_N2 τ_xy^C
P_N3     P_N3 σ_x^S − P_N3 σ_x^C      P_N3 σ_y^S − P_N3 σ_y^C      P_N3 τ_xy^S − P_N3 τ_xy^C
P_N4     P_N4 σ_x^S − P_N4 σ_x^C      P_N4 σ_y^S − P_N4 σ_y^C      P_N4 τ_xy^S − P_N4 τ_xy^C

Table 5.2 Input data: difference between calculated stresses at the target point and those at its neighboring points

Point    σx                           σy                           τxy
P_N1     P_N1 σ_x^C − P_T σ_x^C       P_N1 σ_y^C − P_T σ_y^C       P_N1 τ_xy^C − P_T τ_xy^C
P_N2     P_N2 σ_x^C − P_T σ_x^C       P_N2 σ_y^C − P_T σ_y^C       P_N2 τ_xy^C − P_T τ_xy^C
P_N3     P_N3 σ_x^C − P_T σ_x^C       P_N3 σ_y^C − P_T σ_y^C       P_N3 τ_xy^C − P_T τ_xy^C
P_N4     P_N4 σ_x^C − P_T σ_x^C       P_N4 σ_y^C − P_T σ_y^C       P_N4 τ_xy^C − P_T τ_xy^C

points) values are generated as shown in Table 5.2, which are considered to represent
the local variation of stress in the vicinity of the target point.
Note that P_T σ_x^S means σ_x^S at the point P_T, etc., in the tables, and all the input data are calculated from the results of the finite element analysis using a coarse mesh.
The difference between the stresses at the target point (PT ) obtained using a
coarse mesh and that obtained using a fine mesh, a total of three values (three stress
components × 1 point), are generated as the teacher data for the neural network. This
indicates that the trained neural network is not expected to have an extrapolation
capability to predict stresses with a fine mesh, but to have an interpolation capability
based on the mapping between the stresses with a coarse mesh and those with a fine
mesh.
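A minimal Python sketch of how one training pattern could be assembled from these quantities is given below (our own illustration; the array layout — index 0 for P_T, indices 1 to 4 for P_N1 to P_N4 — and the sign convention of the teacher values are assumptions):

import numpy as np

def make_pattern(coarse, smooth, fine):
    # coarse, smooth, fine: (5, 3) arrays of (sx, sy, txy) at P_T and P_N1..P_N4
    # from the coarse-mesh FEM, its smoothed field, and the fine-mesh FEM
    input_a = (smooth - coarse).reshape(-1)           # 15 values (Table 5.1)
    input_b = (coarse[1:] - coarse[0]).reshape(-1)    # 12 values (Table 5.2)
    x = np.concatenate([input_a, input_b])            # 27 input values
    t = fine[0] - coarse[0]                           # 3 teacher values: coarse-to-fine difference at P_T
    return x, t

coarse = np.random.rand(5, 3)
smooth = coarse + 0.05 * np.random.rand(5, 3)
fine = coarse + 0.02 * np.random.rand(5, 3)
x, t = make_pattern(coarse, smooth, fine)
print(x.shape, t.shape)   # (27,) (3,)

Adding the predicted difference back to the coarse-mesh stress at P_T then recovers an estimate of the fine-mesh stress there.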
Once the configuration of the input data and teacher data for the neural network
has been determined as described above, a large number of training patterns, each of
which representing different stress states around a target point, are to be generated
and collected.
Here, we take a two-dimensional stress analysis of a square area with various
boundary conditions as a platform, where a large number of training patterns, each
representing different stress state around a target point, are generated. The square
shape of the analysis domain is considered (side length 4 [m]), and the material is
assumed to be steel. The entire domain is evenly divided into four-node quadrilateral
elements, and two types of meshes with different levels of fineness are used: a coarse
mesh with 16 elements (4 × 4) and a fine mesh with 65,536 elements (256 × 256).
As for boundary conditions, the bottom surface is fixed, and an equally distributed
load of 1 [N/m] is applied to two edges selected from ten edges (numbered from 1 to 10 in Fig. 5.17, each of them being an edge of an element) of the coarse mesh with 16 elements.
The direction of the load θ is set to one of the following twelve values (Fig. 5.18):
{0, π/12, 2π/12, 3π/12, 4π/12, 5π/12, 6π/12, 7π/12, 8π/12, 9π/12, 10π/12, 11π/12}. Figure 5.19 shows some samples of the boundary conditions.
There are 45 = \binom{10}{2} choices of two edges where distributed loads are applied, and 144 (= 12 × 12) choices of load directions, so the total number of choices of load boundary conditions is 6480 (= 45 × 144). Note that the notation \binom{n}{r} means the number of ways of choosing r objects out of n objects, ignoring the order of choosing them.
For each of these 6480 boundary conditions, a two-dimensional linear stress anal-
ysis is performed using a coarse mesh (16 elements), and the stresses (σxC , σ yC , τxCy ) at
1600 (40 × 40) stress evaluation points evenly distributed in a grid (grid spacing 0.1)
within the domain are calculated. In addition, the smoothed stresses (σxS , σ yS , τxSy ) at
the stress evaluation points are calculated.
The smoothed stresses are calculated by the following procedure. First, the
smoothed stress at each node is calculated by simply averaging the stresses at the
node obtained for each element. Then, from the smoothed stresses at the node, the
smoothed stresses at the stress evaluation points in an element are obtained using the
shape functions of the element. (See Sect. 5.3.1).
Similarly, a two-dimensional linear stress analysis is performed using a fine mesh
(65,536 elements) under each boundary condition, and (σxF , σ yF , τxFy ) at the stress
evaluation points are calculated.
As a result, the stresses (σ_x^C, σ_y^C, τ_xy^C) and their smoothed ones (σ_x^S, σ_y^S, τ_xy^S) with a coarse mesh and the stresses (σ_x^F, σ_y^F, τ_xy^F) with a fine mesh are obtained at 1600 stress evaluation points for each boundary condition.
It is noted that 1444 points excluding the outermost points out of the 1600 stress
evaluation points can be used as the target points in Fig. 5.16a. Since 1444 target
points are employed for each of the 6480 boundary conditions, a total of 9,357,120
(6480 × 1444) training patterns are collected.

5.5.2 Training Phase

Training and verification patterns used to train the feedforward neural network are
chosen at random from 9.36 million patterns collected in the previous section.
Here, four sets of training patterns are tested, each consisting of 300,000, 200,000,
100,000, and 50,000 patterns, while 100,000 patterns are selected for the verification.
The feedforward neural networks tested have 27 units in the input layer and 3
units in the output layer with the number of hidden layers ranged from 1 to 6, and
the number of units in each hidden layer is selected to be 20, 50, or 80. The number
of training epochs is set to 10,000.
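A minimal sketch of such a network in PyTorch is given below (the book does not prescribe a framework; the activation function, optimizer, and learning rate are our own assumptions):

import torch
import torch.nn as nn

def build_network(n_hidden_layers=5, n_units=80):
    # 27 inputs, n_hidden_layers hidden layers of n_units units each, 3 outputs
    layers, n_in = [], 27
    for _ in range(n_hidden_layers):
        layers += [nn.Linear(n_in, n_units), nn.ReLU()]
        n_in = n_units
    layers += [nn.Linear(n_in, 3)]
    return nn.Sequential(*layers)

model = build_network()
optimizer = torch.optim.Adam(model.parameters(), lr=1.0e-3)
loss_fn = nn.MSELoss()

# one (dummy) training step on a mini-batch of patterns
x = torch.randn(64, 27)   # 64 input patterns of 27 values each
t = torch.randn(64, 3)    # corresponding teacher data
loss = loss_fn(model(x), t)
optimizer.zero_grad()
loss.backward()
optimizer.step()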
The results of all the training conditions are shown in Fig. 5.20, where the hori-
zontal axis is the number of hidden layers, U20, U50, and U80 mean that the number
of units per hidden layer is 20, 50, and 80, respectively, and TP050, TP100, TP200,
and TP300 mean that the number of training patterns is 50,000, 100,000, 200,000, and
300,000, respectively. The vertical axis is the average error for 100,000 verification
patterns, defined as

Error = \frac{1}{N_{TP}} \sum_{i=1}^{N_{TP}} \sum_{j=1}^{N_{OU}} \left| O_j^i - T_j^i \right|    (5.5.1)

where N_TP is the number of patterns for verification, N_OU that of units in the output layer of the neural network, O_j^i the output value of the j-th output unit for the i-th verification pattern, and T_j^i the value of the corresponding teacher signal.
This figure suggests that:
(a) When the number of units in the hidden layer is small, increasing the number
of intermediate layers may not reduce the error.
(b) When the number of units in the hidden layer is large, the error is reduced with
increasing the number of hidden layers.
(c) For any training condition (number of hidden layers, number of units per hidden
layer), increasing the number of training patterns reduces the error.
This figure also shows that the best result is given with 5 hidden layers, 80 units
per hidden layer, and 300,000 training patterns.
When the number of training epochs is extended to 30,000 for the neural network
trained above, the error for the verification pattern has decreased from 0.04586 at
10,000 training epochs to 0.04294 at 30,000 epochs.
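The verification error of Eq. (5.5.1) used above can be computed as in the following minimal Python sketch (the array names are ours):

import numpy as np

def verification_error(outputs, teachers):
    # outputs, teachers: (N_TP, N_OU) arrays; Eq. (5.5.1)
    return np.sum(np.abs(outputs - teachers)) / outputs.shape[0]

O = np.array([[0.10, 0.20, 0.00],
              [0.05, 0.15, 0.10]])
T = np.array([[0.12, 0.18, 0.01],
              [0.06, 0.14, 0.08]])
print(verification_error(O, T))   # (0.05 + 0.04) / 2 = 0.045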

5.5.3 Application Phase

The trained neural network constructed in Sect. 5.5.2 can be applied to various two-
dimensional stress analysis problems. Here, its performance is evaluated in detail for
the 100,000 verification patterns used in the Training Phase.
First, let us discuss the accuracy of the feedforward neural network with five
hidden layers and eighty units per hidden layer, which has been trained with 30,000
epochs and achieved the smallest error.
Figure 5.21 shows the distribution of the estimation error of σx for 100,000 patterns
for verification, where the vertical axis is the number of patterns and the horizontal
axis the error in estimated stress. This error is defined as the difference between the
estimated stress σxN N by the trained neural network and the stress σxF obtained with
the fine mesh, i.e., σxF − σxN N . Note that, for example, if the difference is in the range
of [0.01, 0.03], the median value of 0.02 is taken as the representative value, the
patterns with an error value of −0.98 or less have a representative value of −0.98,
and those with an error value of 0.99 or greater have a representative value of 1.00.
For comparison, the distribution of σxF − σxC , the difference between the stresses
obtained with the fine mesh and those with the coarse mesh, is shown as a dotted
line.
Figure 5.21 shows that the stress σxN N estimated by the trained neural network is
closer to the stress σxF obtained by the finite element analysis with the fine mesh than
the stress σxC obtained with the coarse mesh. Figures 5.22 and 5.23 show the similar
results for σ y and τx y , respectively. These figures show that the estimated stresses by
the trained neural network are close to accurate stresses obtained with the fine mesh.
Next, we study some examples in estimating stress distribution using the trained
neural network. Figures 5.24, 5.25 and 5.26 show the stress distributions along the
horizontal line (y = 3.85) near the top surface of the analysis domain shown in
Fig. 5.17 under some loading condition.
Figure 5.24 shows the results of σxN N (denoted as σx (DL) in the figure) estimated
by the trained neural network for 38 stress evaluation points on the corresponding
line. For comparison, the stresses σxF (σx (fine)) calculated with a fine mesh, the
stresses σxC (σx (coarse)) calculated with a coarse mesh, and their smoothed stresses
σxS (σx (smoothing)) are also shown in the figure, where the former stress repre-
sents teacher data and the latter two stresses input data for the neural network. It is
concluded that the estimated results by deep learning reproduce well the highly accu-
rate stress σxF calculated with a fine mesh, even though the estimation is only based
on stresses with a coarse mesh. Similarly, Figs. 5.25 and 5.26 show the estimated
results of σ y and τx y by the trained neural network, respectively. In both cases, as in
the case of σx , the highly accurate stresses are reproduced by the present method.
For more details, refer to the paper [16], which also provides an example of Method-B (Fig. 5.15).

Fig. 5.14 Flowchart of method-A: a conventional FEM, b Method-A


Fig. 5.15 Flowchart of method-B: a adaptive FEM, b Method-B

Fig. 5.16 Target point and its neighboring points


Fig. 5.17 Coarse mesh (16 elements); the ten loadable edges are numbered 1–10

Fig. 5.18 Direction of applied force
Fig. 5.19 Samples of boundary conditions


Fig. 5.20 Errors versus number of hidden layers for various numbers of training patterns and hidden units (vertical axis: Error; horizontal axis: Number of Hidden Layers)

Fig. 5.21 Error distribution: σ_x (curves σ_x^F − σ_x^C and σ_x^F − σ_x^NN; vertical axis: Number of Patterns; horizontal axis: Error)
Fig. 5.22 Error distribution: σ_y (curves σ_y^F − σ_y^C and σ_y^F − σ_y^NN; vertical axis: Number of Patterns; horizontal axis: Error)

Fig. 5.23 Error distribution: τ_xy (curves τ_xy^F − τ_xy^C and τ_xy^F − τ_xy^NN; vertical axis: Number of Patterns; horizontal axis: Error)

Fig. 5.24 Estimated value of σ_x at y = 3.85: σ_x (coarse), σ_x (smoothing), σ_x (fine), and σ_x (DL) versus x. Reprinted from [16] with permission from Springer
Fig. 5.25 Estimated value of σ_y at y = 3.85: σ_y (coarse), σ_y (smoothing), σ_y (fine), and σ_y (DL) versus x. Reprinted from [16] with permission from Springer

Fig. 5.26 Estimated value of τ_xy at y = 3.85: τ_xy (coarse), τ_xy (smoothing), τ_xy (fine), and τ_xy (DL) versus x. Reprinted from [16] with permission from Springer

References

1. Ainsworth, M., Oden, J.T.: A posteriori error estimation in finite element analysis. Comput.
Methods Appl. Mech. Eng. 142, 1-88 (1997)
2. Babuska, I., Rheinboldt, W.C.: Error estimates for adaptive finite element computations. SIAM
J. Numer. Anal. 15, 736-754 (1978)
3. Babuska, I., Vogelius, M.: Feedback and adaptive finite element solution of one-dimensional
boundary value problems. Numer. Math. 44, 75-102 (1984)
4. Brenner, S.C., Scott, L.R.: The Mathematical Theory of Finite Element Methods, Springer
(1994)
5. Carey, G.F.: Computational Grids: Generation, Adaptation, and Solution Strategies. Taylor &
Francis (1994)
6. Cuthill, E.: Several Strategies for Reducing the Bandwidth of Matrices. In: Rose D.J.,
Willoughby R.A. (eds) Sparse Matrices and their Applications. The IBM Research Symposia
Series, Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-8675-3_14
7. Cuthill, E., McKee, J.: Reducing the bandwidth of sparse symmetric matrices. ACM ’69:
Proceedings of the 1969 24th national conference, Aug. 1969, pp. 157–172.
8. Gibbs, N.E., Poole, W.G. Jr., Stockmeyer, P.K.: An algorithm for reducing the bandwidth and
profile of a sparse matrix. SIAM J. Numer. Anal. 13, 236-250 (1976)
9. Golub, G.H., Van Loan, C.F.: Matrix Computations (Third Edition). The Johns Hopkins
University Press (1996)
10. Grätsch, T., Bathe, K.J.: A posteriori error estimation techniques in practical finite element
analysis. Comput. Struct. 83, 235-265 (2005)
11. Jennings, A., McKeown, J.J.: Matrix Computations (Second Edition). John Wiley & Sons
(1992)
12. King, I.P.: An automatic reordering scheme for simultaneous equations derived from network
systems. Int. J. Numer. Methods Eng. 2, 523-533 (1970)
13. Knuth, D.E.: Big omicron and big omega and big theta. SIGACT News 8(2), 18–24 (1976).
https://doi.org/10.1145/1008328.1008329
14. Liu, W.-H., Sherman, A.H.: Comparative analysis of the Cuthill-McKee and Reverse Cuthill-
McKee ordering algorithms for sparse matrix. SIAM J. Numer. Anal. 13, 198-213 (1976)
15. Murotani, K., Yagawa, G., Choi, J.B.: Adaptive finite elements using hierarchical mesh and
its application to crack propagation analysis. Comput. Methods Appl. Mech. Eng. 253, 1-14
(2013)
16. Oishi, A., Yagawa, G.: Finite elements using neural networks and a posteriori error. Arch.
Comput. Methods Eng. 28, 3433-3456 (2021). https://doi.org/10.1007/s11831-020-09507-0.
17. Reddy, J.N.: An Introduction to the Finite Element Method (Second Edition). McGraw-Hill
(1993)
18. Sloan, S.W.: An algorithm for profile and wavefront reduction of sparse matrices. Int. J. Numer.
Methods Eng. 23, 239-251 (1986)
19. Ueberhuber, C.W.: Numerical Computation 2. Springer (1997)
20. Verfürth, R.: A review of a posteriori error estimation and adaptive mesh refinement techniques.
Wiley-Teubner (1996)
21. Verfürth, R.: A Posteriori Error Estimation Techniques for Finite Element Methods. Oxford
University Press, Oxford (2013)
22. Yagawa, G., Ichimiya, M., Ando, Y.: Analysis method for stress intensity factors based on the
discretization error in the finite element method. Trans. JSME 44(379), 743-755 (1978). (in
Japanese).
23. Zienkiewicz, O.C., Morgan, K.: Finite Elements & Approximation. Dover (2006)
24. Zienkiewicz, O.C., Zhu, J.Z.: A simple error estimator and adaptive procedure for practical
engineering analysis. Int. J. Numer. Methods Eng. 24, 337-357 (1987)
Chapter 6
Contact Mechanics with Deep Learning

Abstract With the progress of computational mechanics, simulations of various


mechanical phenomena have been put to practical use. Simulation of contact and
collision between objects is one of them. In this chapter, we study an application
of deep learning to the contact search process, which is indispensable in contact
and collision analysis. In particular, we focus on the contact between two smooth
contact surfaces. In Sect. 6.1, the basics of the contact analysis and contact search are
discussed. In Sects. 6.2, 6.3, and 6.4, the NURBS basis functions used to represent
smooth contact surfaces, operations such as segmentation of NURBS-defined shapes,
and conventional surface-to-surface contact search methods are taken, respectively.
With these preparations, Sect. 6.5 formulates a contact search method using deep
learning, and finally, Sect. 6.6 shows a numerical example.

6.1 Basics of Contact Mechanics

It is well known that the collision and contact analysis deals with contact phenomena
between multiple objects or between multiple locations of a single object [7–9, 21,
22]. In this section, the basic items of the contact analysis and contact search in the
finite element method are studied.
Considering the dynamic effect, the matrix equation of the FEM is written as
follows:

[M]{Ü } + [K ]{U } = {F} (6.1.1)

where the damping is not considered, [M] is the mass matrix, [K ] the global stiffness
matrix, {F} the load vector, and {Ü } and {U } the acceleration and displacement
vectors, respectively. Discretizing Eq. (6.1.1) in time, we have
\frac{1}{(\Delta t)^2}[M]\{U\}_{n+1} = \{F\}_n - \left([K] - \frac{2}{(\Delta t)^2}[M]\right)\{U\}_n - \frac{1}{(\Delta t)^2}[M]\{U\}_{n-1}    (6.1.2)

where Δt is the time increment and {·}_n denotes a variable at time t_n.


When [M] is assumed to be a lumped mass matrix, which is a diagonal one, we
are able to obtain the solution of the equation above without solving the simultaneous
linear equations. In this case, a stable solution is achieved, only when Δt satisfies
the Courant–Friedrichs–Lewy (CFL) condition as follows [1]:

\Delta t \le \frac{l}{c}    (6.1.3)
where l is the element length and c the velocity of the stress wave. Note that, if the
small element size is employed for the sake of achieving accurate result, a very small
value of Δt has to be adopted to satisfy the CFL condition, which may increase the
computational load.
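A minimal Python sketch of the resulting explicit update, i.e., Eq. (6.1.2) with a lumped (diagonal) mass matrix and a CFL-limited time step, is given below (our own illustration with toy data; the contact forces discussed next would simply be added to {F}_n at every step):

import numpy as np

def explicit_step(M_diag, K, U_n, U_nm1, F_n, dt):
    # central-difference update of Eq. (6.1.2); with a lumped mass matrix the
    # solve reduces to an elementwise division by the diagonal of [M]
    accel_term = (F_n - K @ U_n) / M_diag
    return dt**2 * accel_term + 2.0 * U_n - U_nm1     # {U}_{n+1}

def cfl_time_step(element_length, wave_speed, safety=0.9):
    # stable time increment from the CFL condition, Eq. (6.1.3)
    return safety * element_length / wave_speed

# toy two-degree-of-freedom system
M_diag = np.array([1.0, 1.0])
K = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])
U_nm1 = np.zeros(2)
U_n = np.zeros(2)
F_n = np.array([0.0, 1.0])
dt = cfl_time_step(element_length=0.01, wave_speed=5000.0)
print(dt, explicit_step(M_diag, K, U_n, U_nm1, F_n, dt))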
In the dynamic explicit contact analysis, allowing a small penetration into the
other object, the repulsive force proportional to the penetration depth is calculated,
which is defined as the contact force, and it is essential to perform a contact search
to accurately identify the location and contact state of the collision or contact point,
where the contact state includes the penetration depth into the other object.
The procedure for the contact analysis based on the explicit dynamics is
summarized as follows:
1. Start the calculation for the first step (n = 1).
2. Identify the location of a contact point and the contact state by contact search.
3. Calculate the appropriate contact force {FC }n at the contact point.
4. Calculate {U }n+1 based on Eq. (6.1.2) with {FC }n added to the right-hand side
of the equation.
5. n → n + 1
6. Return to 2.
Contact search is usually performed in two stages: the global search and the
subsequent local search. In the former, a sub-region with a high possibility of contact
is searched from the entire region, while, in the latter, the positions of contact points
and contact states are identified at the region picked up in the global search. In the
former search, a hierarchical bounding box is used to improve the efficiency of the
search [3, 11]. Nevertheless, it is still difficult to improve the efficiency of the global
search, especially in distributed-memory parallel processing environments, because
the global contact search must cover multiple objects usually distributed among
processors [12]. As for the latter search, on the other hand, its computational load and its stability, which depend on the iterative solution process, may be issues.
For example, the node-segment type contact search algorithm, a typical method of
dynamic contact analysis in the FEM, consists of global and local search processes.
In the former process, for a node at one of the facing contact surfaces a segment (a
face of an element) on the other contact surface that is at the shortest distance to
the node is searched, and then, in the latter process, the exact location of the contact
point is identified for the pair of the node and the segment selected in the former
process [2, 5].
Fig. 6.1 Node-to-segment algorithm: node P_S and its projection H onto the segment with corner nodes P_1, P_2, P_3, P_4

Let’s consider the local contact search in the node-segment algorithm when a
contact surface consists of rectangular segments as shown in Fig. 6.1. This is to find
the local coordinates ξc and ηc of the foot H (x H , y H , z H ) of the perpendicular line
from the node PS (xS , yS , z S ) at one of the facing contact surfaces to the segment at
the opposite contact surface selected in the global search. Since H (x H , y H , z H ) is at
the segment, we can write using local coordinates as
H(\xi,\eta) = \begin{pmatrix} x_H \\ y_H \\ z_H \end{pmatrix} = \sum_{i=1}^{4} N_i(\xi,\eta) P_i = \sum_{i=1}^{4} N_i(\xi,\eta) \begin{pmatrix} X_i \\ Y_i \\ Z_i \end{pmatrix}    (6.1.4)

where Ni (ξ, η) are the first-order basis functions of a four-node quadrilateral element
of the finite element method and represented as

N_1(\xi,\eta) = \frac{1}{4}(1-\xi)(1-\eta)    (6.1.5)

N_2(\xi,\eta) = \frac{1}{4}(1+\xi)(1-\eta)    (6.1.6)

N_3(\xi,\eta) = \frac{1}{4}(1+\xi)(1+\eta)    (6.1.7)

N_4(\xi,\eta) = \frac{1}{4}(1-\xi)(1+\eta)    (6.1.8)
Since H (x H , y H , z H ) is the closest point to PS (xS , yS , z S ) at the segment, we have

(H(\xi,\eta) - P_S)^2 \rightarrow \min    (6.1.9)

Based on Eq. (6.1.9), the following equations are derived.


(H(\xi,\eta) - P_S) \cdot \frac{\partial H(\xi,\eta)}{\partial \xi} = 0    (6.1.10)

(H(\xi,\eta) - P_S) \cdot \frac{\partial H(\xi,\eta)}{\partial \eta} = 0    (6.1.11)

Solving these two equations using Newton’s method, the local coordinates of the
contact point ξc and ηc are calculated, with which the signed distance g between
PS (xS , yS , z S ) and H (ξc , ηc ) is calculated as
g = \overrightarrow{H P_S} \cdot \vec{n}    (6.1.12)

where n→ is the outward unit normal vector at H (ξc , ηc ). If g is less than or equal
to 0, the contact is judged to have occurred with the penetration depth |g|, and a
contact force proportional to |g| is added to the contact point H (ξc , ηc ), as well as
to PS (xS , yS , z S ) as a reaction force.
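A minimal Python sketch of this local search is given below; for simplicity it drops the second derivative of H in the iteration (a Gauss–Newton-type approximation), so it is our own simplification rather than the book's exact Newton scheme:

import numpy as np

def shape(xi, eta):
    # first-order basis functions of Eqs. (6.1.5)-(6.1.8)
    return 0.25 * np.array([(1 - xi) * (1 - eta), (1 + xi) * (1 - eta),
                            (1 + xi) * (1 + eta), (1 - xi) * (1 + eta)])

def shape_derivs(xi, eta):
    d_xi = 0.25 * np.array([-(1 - eta), (1 - eta), (1 + eta), -(1 + eta)])
    d_eta = 0.25 * np.array([-(1 - xi), -(1 + xi), (1 + xi), (1 - xi)])
    return d_xi, d_eta

def local_search(P, Ps, tol=1.0e-10, max_iter=20):
    # P: (4, 3) corner coordinates of the segment, Ps: (3,) node to project
    xi = eta = 0.0
    for _ in range(max_iter):
        dN_xi, dN_eta = shape_derivs(xi, eta)
        H = shape(xi, eta) @ P
        H_xi, H_eta = dN_xi @ P, dN_eta @ P
        r = H - Ps
        g = np.array([r @ H_xi, r @ H_eta])          # Eqs. (6.1.10) and (6.1.11)
        J = np.array([[H_xi @ H_xi, H_xi @ H_eta],
                      [H_eta @ H_xi, H_eta @ H_eta]])
        d = np.linalg.solve(J, -g)
        xi, eta = xi + d[0], eta + d[1]
        if np.linalg.norm(d) < tol:
            break
    return xi, eta, shape(xi, eta) @ P

# flat segment in the z = 0 plane and a node above the point (0.25, 0.10, 0)
P = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0], [0.0, 1.0, 0.0]])
Ps = np.array([0.25, 0.10, 1.0])
print(local_search(P, Ps))   # the contact point H approaches (0.25, 0.10, 0.0)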
It is noted that the collision and contact analysis using the FEM has some
problems caused by inaccurate definition in geometry of contact surfaces (Fig. 6.2).
In other words, the majority of the basis functions of the FEM have C 0 continuity
only at the element boundary (see Sect. 5.3.1), causing that the repulsive contact
force added when contact is detected changes its direction discontinuously at the
element boundary, which deteriorates the stability and convergence of the analysis
[10, 16, 20].
The surface irregularities caused by the basis functions could be eliminated by
using smooth basis functions. For example, if NURBS [15, 17] is used as the basis
function of the analysis to represent a smooth surface of bodies, the contact analysis
could be performed with a smooth contact surface as shown in Fig. 6.3. For this reason
and other benefits, the isogeometric analysis using NURBS as the basis function
of analysis [4, 6] and the NURBS-Enhanced FEM (NEFEM) [18, 19] have been

Fig. 6.2 Contact surface defined using conventional finite element basis functions
Fig. 6.3 Contact surface defined using NURBS basis functions

developed with the advantage that the smooth shapes defined in CAD are assured
during the contact analysis.

6.2 NURBS Basis Functions

As studied in the previous section, the B-spline and the NURBS basis functions can
be used to represent smooth surfaces. We discuss here these smooth basis functions
in some detail.
The NURBS basis functions, used in computer-aided design (CAD) systems to represent the shapes of objects, are derived from the one-dimensional B-spline basis functions. These functions of the p-th order, N_{i,p}(ξ), are defined as [15, 17]
N_{i,0}(\xi) = \begin{cases} 1 & (\xi_i \le \xi < \xi_{i+1}) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.1)

N_{i,p}(\xi) = \frac{\xi - \xi_i}{\xi_{i+p} - \xi_i} N_{i,p-1}(\xi) + \frac{\xi_{i+p+1} - \xi}{\xi_{i+p+1} - \xi_{i+1}} N_{i+1,p-1}(\xi)    (6.2.2)
where a knot vector \Xi = \{\xi_1, \xi_2, \xi_3, \ldots, \xi_{n+p}, \xi_{n+p+1}\} is a sequence of monotonically non-decreasing real numbers, and the rule 0/0 = 0 is applied to the fractional parts of Eq. (6.2.2).
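A minimal Python sketch of this recursion (our own implementation of Eqs. (6.2.1) and (6.2.2), with 0-based indices and the 0/0 = 0 rule applied explicitly) is given below:

def bspline_basis(i, p, xi, knots):
    # N_{i,p}(xi) on the knot vector 'knots'; i is 0-based here
    if p == 0:
        return 1.0 if knots[i] <= xi < knots[i + 1] else 0.0
    left_den = knots[i + p] - knots[i]
    right_den = knots[i + p + 1] - knots[i + 1]
    left = 0.0 if left_den == 0.0 else \
        (xi - knots[i]) / left_den * bspline_basis(i, p - 1, xi, knots)
    right = 0.0 if right_den == 0.0 else \
        (knots[i + p + 1] - xi) / right_den * bspline_basis(i + 1, p - 1, xi, knots)
    return left + right

# the five second-order basis functions of the worked example below,
# built on the knot vector {0, 0, 0, 1, 2, 3, 3, 3}
knots = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 3.0, 3.0]
values = [bspline_basis(i, 2, 1.5, knots) for i in range(5)]
print(values, sum(values))   # the values sum to 1 (partition of unity)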
As an example, let’s take the process of constructing five second-order B-spline
basis functions from the knot vector {0, 0, 0, 1, 2, 3, 3, 3}. First, from Eq. (6.2.1),
B-spline functions of the 0-th order from N1,0 to N7,0 are, respectively, given as
follows:
N_{1,0}(\xi) = \begin{cases} 1 & (0 \le \xi < 0) \\ 0 & (\text{otherwise}) \end{cases} = 0    (6.2.3a)

N_{2,0}(\xi) = \begin{cases} 1 & (0 \le \xi < 0) \\ 0 & (\text{otherwise}) \end{cases} = 0    (6.2.3b)

N_{3,0}(\xi) = \begin{cases} 1 & (0 \le \xi < 1) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.3c)

N_{4,0}(\xi) = \begin{cases} 1 & (1 \le \xi < 2) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.3d)

N_{5,0}(\xi) = \begin{cases} 1 & (2 \le \xi < 3) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.3e)

N_{6,0}(\xi) = \begin{cases} 1 & (3 \le \xi < 3) \\ 0 & (\text{otherwise}) \end{cases} = 0    (6.2.3f)

N_{7,0}(\xi) = \begin{cases} 1 & (3 \le \xi < 3) \\ 0 & (\text{otherwise}) \end{cases} = 0    (6.2.3g)

Next, B-spline functions of the first order are, respectively, constructed from the
B-spline functions of the 0th order by Eq. (6.2.2) as

N_{1,1}(\xi) = \frac{\xi-\xi_1}{\xi_2-\xi_1} N_{1,0}(\xi) + \frac{\xi_3-\xi}{\xi_3-\xi_2} N_{2,0}(\xi) = \frac{\xi-0}{0-0}\,0 + \frac{0-\xi}{0-0}\,0 = 0    (6.2.4a)

N_{2,1}(\xi) = \frac{\xi-\xi_2}{\xi_3-\xi_2} N_{2,0}(\xi) + \frac{\xi_4-\xi}{\xi_4-\xi_3} N_{3,0}(\xi) = \begin{cases} 1-\xi & (0 \le \xi < 1) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.4b)

N_{3,1}(\xi) = \frac{\xi-\xi_3}{\xi_4-\xi_3} N_{3,0}(\xi) + \frac{\xi_5-\xi}{\xi_5-\xi_4} N_{4,0}(\xi) = \begin{cases} \xi & (0 \le \xi < 1) \\ 2-\xi & (1 \le \xi < 2) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.4c)

N_{4,1}(\xi) = \frac{\xi-\xi_4}{\xi_5-\xi_4} N_{4,0}(\xi) + \frac{\xi_6-\xi}{\xi_6-\xi_5} N_{5,0}(\xi) = \begin{cases} \xi-1 & (1 \le \xi < 2) \\ 3-\xi & (2 \le \xi < 3) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.4d)

N_{5,1}(\xi) = \frac{\xi-\xi_5}{\xi_6-\xi_5} N_{5,0}(\xi) + \frac{\xi_7-\xi}{\xi_7-\xi_6} N_{6,0}(\xi) = \begin{cases} \xi-2 & (2 \le \xi < 3) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.4e)

N_{6,1}(\xi) = \frac{\xi-\xi_6}{\xi_7-\xi_6} N_{6,0}(\xi) + \frac{\xi_8-\xi}{\xi_8-\xi_7} N_{7,0}(\xi) = 0    (6.2.4f)

Finally, B-spline functions of the second order are, respectively, given from the
B-spline functions of the first order by Eq. (6.2.2) as follows:

N_{1,2}(\xi) = \frac{\xi-\xi_1}{\xi_3-\xi_1} N_{1,1}(\xi) + \frac{\xi_4-\xi}{\xi_4-\xi_2} N_{2,1}(\xi) = \begin{cases} (1-\xi)^2 & (0 \le \xi < 1) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.5a)

N_{2,2}(\xi) = \frac{\xi-\xi_2}{\xi_4-\xi_2} N_{2,1}(\xi) + \frac{\xi_5-\xi}{\xi_5-\xi_3} N_{3,1}(\xi) = \begin{cases} \xi(1-\xi) + \frac{1}{2}(2-\xi)\xi & (0 \le \xi < 1) \\ \frac{1}{2}(2-\xi)^2 & (1 \le \xi < 2) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.5b)

N_{3,2}(\xi) = \frac{\xi-\xi_3}{\xi_5-\xi_3} N_{3,1}(\xi) + \frac{\xi_6-\xi}{\xi_6-\xi_4} N_{4,1}(\xi) = \begin{cases} \frac{1}{2}\xi^2 & (0 \le \xi < 1) \\ \frac{1}{2}\xi(2-\xi) + \frac{1}{2}(3-\xi)(\xi-1) & (1 \le \xi < 2) \\ \frac{1}{2}(3-\xi)^2 & (2 \le \xi < 3) \end{cases}    (6.2.5c)

N_{4,2}(\xi) = \frac{\xi-\xi_4}{\xi_6-\xi_4} N_{4,1}(\xi) + \frac{\xi_7-\xi}{\xi_7-\xi_5} N_{5,1}(\xi) = \begin{cases} \frac{1}{2}(\xi-1)^2 & (1 \le \xi < 2) \\ \frac{1}{2}(\xi-1)(3-\xi) + (3-\xi)(\xi-2) & (2 \le \xi < 3) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.5d)

N_{5,2}(\xi) = \frac{\xi-\xi_5}{\xi_7-\xi_5} N_{5,1}(\xi) + \frac{\xi_8-\xi}{\xi_8-\xi_6} N_{6,1}(\xi) = \begin{cases} (\xi-2)^2 & (2 \le \xi < 3) \\ 0 & (\text{otherwise}) \end{cases}    (6.2.5e)

Let’s take N3,2 (ξ ) as an example. As shown in Eq. (6.2.5c), the function is defined
by different expressions for each interval. If the equations for each interval are
denoted by f 1 (ξ ), f 2 (ξ ), and f 3 (ξ ), respectively, we have
N_{3,2}(\xi) = \begin{cases} f_1(\xi) & (0 \le \xi < 1) \\ f_2(\xi) & (1 \le \xi < 2) \\ f_3(\xi) & (2 \le \xi < 3) \end{cases} = \begin{cases} \frac{1}{2}\xi^2 & (0 \le \xi < 1) \\ \frac{1}{2}\xi(2-\xi) + \frac{1}{2}(3-\xi)(\xi-1) & (1 \le \xi < 2) \\ \frac{1}{2}(3-\xi)^2 & (2 \le \xi < 3) \end{cases}    (6.2.6)

Then, we have the following equations,

f_1(1) = f_2(1) = \frac{1}{2}, \quad f_2(2) = f_3(2) = \frac{1}{2}    (6.2.7)

\left.\frac{df_1}{d\xi}\right|_{\xi=1} = \left.\frac{df_2}{d\xi}\right|_{\xi=1} = 1, \quad \left.\frac{df_2}{d\xi}\right|_{\xi=2} = \left.\frac{df_3}{d\xi}\right|_{\xi=2} = -1    (6.2.8)

\left.\frac{d^2 f_1}{d\xi^2}\right|_{\xi=1} \ne \left.\frac{d^2 f_2}{d\xi^2}\right|_{\xi=1}, \quad \left.\frac{d^2 f_2}{d\xi^2}\right|_{\xi=2} \ne \left.\frac{d^2 f_3}{d\xi^2}\right|_{\xi=2}    (6.2.9)

These show that N_{3,2}(ξ) is a smooth function, indicating C^1 continuity at the knot values at the boundaries between intervals (see Sect. 5.3.1).
Fig. 6.4 a B-spline basis functions of the first order constructed from the knot vector {0, 0, 1, 2, 3, 4, 5, 5}. b B-spline basis functions of the second order constructed from the knot vector {0, 0, 0, 1, 2, 3, 4, 5, 5, 5}

Figure 6.4a, b, respectively, show six B-spline basis functions of the first order
constructed from the knot vector {0, 0, 1, 2, 3, 4, 5, 5}, and seven B-spline basis
functions of the second order constructed from {0, 0, 0, 1, 2, 3, 4, 5, 5, 5}.
It is known that a knot vector is allowed to repeat the same knot values. The standard knot vector in CAD is an "open knot vector," in which the first and last knot values are repeated p + 1 times (i.e., the order of the basis function plus one), and the p-th order B-spline basis functions are C^{p−k} continuous at a k-times repeated knot value.
Let’s discuss some graphs as follows:
Figure 6.5a: eight cubic B-spline basis functions constructed from
{0, 0, 0, 0, 1, 2, 3, 4, 5, 5, 5, 5},
Fig. 6.5 a B-spline basis functions of the third order constructed from the knot vector {0, 0, 0, 0, 1, 2, 3, 4, 5, 5, 5, 5}. b B-spline basis functions of the third order constructed from the knot vector {0, 0, 0, 0, 1, 2, 3, 3, 4, 5, 5, 5, 5}. c B-spline basis functions of the third order constructed from the knot vector {0, 0, 0, 0, 1, 2, 3, 3, 3, 4, 5, 5, 5, 5}. d B-spline basis functions of the third order constructed from the knot vector {0, 0, 0, 0, 1, 2, 3, 3, 3, 3, 4, 5, 5, 5, 5}

Figure 6.5b: nine cubic B-spline basis functions constructed from {0, 0, 0, 0, 1, 2, 3, 3, 4, 5, 5, 5, 5},
Figure 6.5c: ten cubic B-spline basis functions constructed from
{0, 0, 0, 0, 1, 2, 3, 3, 3, 4, 5, 5, 5, 5},
Figure 6.5d: eleven cubic B-spline basis functions constructed from {0, 0, 0, 0, 1,
2, 3, 3, 3, 3, 4, 5, 5, 5, 5}.
The graphs shown in Fig. 6.5a, b are C 2 and C 1 continuous and smooth at the
knot value 3, respectively, while that in Fig. 6.5c is C 0 continuous at the knot value,
which means continuous but not smooth, and that in Fig. 6.5d is C −1 continuous at
the knot value, meaning that the separation of the graph occurs at the point.
Note that the B-spline basis functions are nonnegative and that their sum is 1, which is called the partition of unity, as follows:

N_{i,p}(\xi) \ge 0 \quad (\text{for arbitrary } \xi)    (6.2.10)

\sum_{i=1}^{n} N_{i,p}(\xi) = 1 \quad (\text{for arbitrary } \xi)    (6.2.11)

The two-dimensional B-spline basis functions N_{i,j}^{p,q}(\xi,\eta) and the three-dimensional B-spline basis functions N_{i,j,k}^{p,q,r}(\xi,\eta,\zeta) are, respectively, defined as the product of the one-dimensional B-spline basis functions in each axis as follows:

N_{i,j}^{p,q}(\xi,\eta) = N_{i,p}(\xi) \cdot M_{j,q}(\eta)    (6.2.12)

N_{i,j,k}^{p,q,r}(\xi,\eta,\zeta) = N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot L_{k,r}(\zeta)    (6.2.13)
where p, q, and r are the orders in the ξ, η, and ζ -axis, respectively.


The one-dimensional NURBS basis function of the pth order, Ri, p (ξ ), is defined
by adding a new parameter to the one-dimensional B-spline basis function of the pth
order Ni, p (ξ ) as

R_{i,p}(\xi) = \frac{N_{i,p}(\xi) \cdot w_i}{\sum_{\hat{i}=1}^{n} N_{\hat{i},p}(\xi) \cdot w_{\hat{i}}}    (6.2.14)

where the new parameter w = {w1 , w2 , . . . , wn }(wi > 0) is called the weight. It is
clear from Eq. (6.2.14) that the NURBS basis functions coincide with the B-spline
basis functions when all the weights are equal. In other words, the B-spline basis
functions are included in the NURBS basis functions, and the NURBS basis functions
have the same properties as the B-spline basis functions as follows:

R_{i,p}(\xi) \ge 0 \quad (\text{for arbitrary } \xi)    (6.2.15)

\sum_{i=1}^{n} R_{i,p}(\xi) = 1 \quad (\text{for arbitrary } \xi)    (6.2.16)

Figure 6.6a–e shows the seven quadratic NURBS basis functions constructed from the knot vector {0, 0, 0, 1, 2, 3, 4, 5, 5, 5}, where Fig. 6.6a shows
the basis functions with w = {1, 1, 1, 1, 1/5, 1, 1}, Fig. 6.6b those with w =
{1, 1, 1, 1, 1/2, 1, 1}, Fig. 6.6c those with w = {1, 1, 1, 1, 1, 1, 1}, Fig. 6.6d those
with w = {1, 1, 1, 1, 2, 1, 1}, and Fig. 6.6e those with w = {1, 1, 1, 1, 5, 1, 1},
respectively. It can be seen from these graphs, that, by changing the value of w5 , not
only N5,2 (ξ ) but also N3,2 (ξ ), N4,2 (ξ ), and N6,2 (ξ ) are affected due to the nature of
the partition of unity (Eq. (6.2.16)), while N1,2 (ξ ) and N2,2 (ξ ) are not affected at all.
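A minimal Python sketch of the rational basis of Eq. (6.2.14) is given below (our own implementation with 0-based indices; the B-spline recursion of Eqs. (6.2.1) and (6.2.2) is repeated so that the sketch is self-contained):

def bspline_basis(i, p, xi, knots):
    # Cox-de Boor recursion, Eqs. (6.2.1)-(6.2.2), with the 0/0 = 0 rule
    if p == 0:
        return 1.0 if knots[i] <= xi < knots[i + 1] else 0.0
    left_den = knots[i + p] - knots[i]
    right_den = knots[i + p + 1] - knots[i + 1]
    left = 0.0 if left_den == 0.0 else \
        (xi - knots[i]) / left_den * bspline_basis(i, p - 1, xi, knots)
    right = 0.0 if right_den == 0.0 else \
        (knots[i + p + 1] - xi) / right_den * bspline_basis(i + 1, p - 1, xi, knots)
    return left + right

def nurbs_basis(i, p, xi, knots, weights):
    # Eq. (6.2.14): weighted B-spline basis normalized by the weighted sum
    num = bspline_basis(i, p, xi, knots) * weights[i]
    den = sum(bspline_basis(k, p, xi, knots) * weights[k]
              for k in range(len(weights)))
    return num / den

# seven quadratic functions on {0, 0, 0, 1, 2, 3, 4, 5, 5, 5} with the
# weights of Fig. 6.6d, i.e., the fifth weight set to 2
knots = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 5.0]
weights = [1.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0]
vals = [nurbs_basis(i, 2, 2.5, knots, weights) for i in range(7)]
print(vals, sum(vals))   # the sum is still 1, Eq. (6.2.16)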
The two-dimensional NURBS basis functions R_{i,j}^{p,q}(\xi,\eta) and the three-dimensional NURBS basis functions R_{i,j,k}^{p,q,r}(\xi,\eta,\zeta) are also defined using the one-dimensional B-spline basis functions N_{i,p}(\xi), M_{j,q}(\eta), L_{k,r}(\zeta) and weights in each axis, respectively, as

R_{i,j}^{p,q}(\xi,\eta) = \frac{N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot w_{i,j}}{\sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot w_{\hat{i},\hat{j}}}    (6.2.17)

R_{i,j,k}^{p,q,r}(\xi,\eta,\zeta) = \frac{N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot L_{k,r}(\zeta) \cdot w_{i,j,k}}{\sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}}}    (6.2.18)
Fig. 6.6 a NURBS basis functions of the second order constructed from the knot vector {0, 0, 0, 1, 2, 3, 4, 5, 5, 5} with w = {1, 1, 1, 1, 1/5, 1, 1}. b NURBS basis functions of the second order constructed from the knot vector {0, 0, 0, 1, 2, 3, 4, 5, 5, 5} with w = {1, 1, 1, 1, 1/2, 1, 1}. c NURBS basis functions of the second order constructed from the knot vector {0, 0, 0, 1, 2, 3, 4, 5, 5, 5} with w = {1, 1, 1, 1, 1, 1, 1}. d NURBS basis functions of the second order constructed from the knot vector {0, 0, 0, 1, 2, 3, 4, 5, 5, 5} with w = {1, 1, 1, 1, 2, 1, 1}. e NURBS basis functions of the second order constructed from the knot vector {0, 0, 0, 1, 2, 3, 4, 5, 5, 5} with w = {1, 1, 1, 1, 5, 1, 1}

6.3 NURBS Objects Based on NURBS Basis Functions

Using the NURBS basis functions described in Sect. 6.2, the present section deals
with the methods to represent three-dimensional object shapes with smooth surfaces,
showing how to edit the basis functions while preserving the shape and how to split
the shape.
Using the NURBS basis functions and the control points Bi (or Bi, j , Bi, j,k ), a
curve C(ξ ), a surface S(ξ, η), and a solid V (ξ, η, ζ ) in the three-dimensional space
are, respectively, defined as follows:
C(\xi) = \begin{pmatrix} x(\xi) \\ y(\xi) \\ z(\xi) \end{pmatrix} = \sum_{i}^{n} R_i^p(\xi) \cdot \begin{pmatrix} X_i \\ Y_i \\ Z_i \end{pmatrix} = \sum_{i}^{n} R_i^p(\xi) \cdot B_i    (6.3.1)

S(\xi,\eta) = \begin{pmatrix} x(\xi,\eta) \\ y(\xi,\eta) \\ z(\xi,\eta) \end{pmatrix} = \sum_{i,j}^{n,m} R_{i,j}^{p,q}(\xi,\eta) \cdot \begin{pmatrix} X_{i,j} \\ Y_{i,j} \\ Z_{i,j} \end{pmatrix} = \sum_{i,j}^{n,m} R_{i,j}^{p,q}(\xi,\eta) \cdot B_{i,j}    (6.3.2)

V(\xi,\eta,\zeta) = \begin{pmatrix} x(\xi,\eta,\zeta) \\ y(\xi,\eta,\zeta) \\ z(\xi,\eta,\zeta) \end{pmatrix} = \sum_{i,j,k}^{n,m,l} R_{i,j,k}^{p,q,r}(\xi,\eta,\zeta) \cdot \begin{pmatrix} X_{i,j,k} \\ Y_{i,j,k} \\ Z_{i,j,k} \end{pmatrix} = \sum_{i,j,k}^{n,m,l} R_{i,j,k}^{p,q,r}(\xi,\eta,\zeta) \cdot B_{i,j,k}    (6.3.3)

Let C1(ξ) be the line-segment defined by six B-spline basis functions of the first order (NURBS basis functions of the first order with all the weights set to the same unique value) constructed from the knot vector {0, 0, 1, 2, 3, 4, 5, 5} and six control points P1(0, 0, 0), . . . , P6(5, 0, 0) evenly distributed on the x-axis. Similarly, let C2(ξ) be the line-segment defined by seven B-spline basis functions of the second order constructed from the knot vector {0, 0, 0, 1, 2, 3, 4, 5, 5, 5} and seven control points evenly distributed between (0, 0, 0) and (5, 0, 0), and C3(ξ) the one defined by eight B-spline basis functions of the third order constructed from the knot vector {0, 0, 0, 0, 1, 2, 3, 4, 5, 5, 5, 5} and eight control points evenly distributed between (0, 0, 0) and (5, 0, 0). Then, as shown in Fig. 6.7, the shapes of these three line-segments are identical, but their internal representations are not. Figure 6.8 shows the relationship between a parameter ξ and the coordinate C(ξ) of a point on the line corresponding to the parameter, where the horizontal axis is the parameter ξ and the vertical axis C(ξ). C1(ξ), C2(ξ), and C3(ξ) coincide at ξ = 0 and ξ = 5, indicating that they are line-segments with endpoints at (0, 0, 0) and (5, 0, 0). On the other hand, C1(ξ) varies linearly with respect to the parameter ξ, while C2(ξ) and C3(ξ) vary nonlinearly.
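To make the curve definition concrete, the following Python sketch (ours, not the authors' code; the function names and the half-open handling of the knot spans are illustrative assumptions) evaluates the B-spline basis functions N_{i,p}(ξ) by the Cox–de Boor recursion and sums them against control points as in Eq. (6.3.1) with all weights equal, reproducing a point on the straight segment C2(ξ) described above.

```python
import numpy as np

def bspline_basis(i, p, xi, knots):
    """N_{i,p}(xi) by the Cox-de Boor recursion (0-based index i, half-open spans)."""
    if p == 0:
        return 1.0 if knots[i] <= xi < knots[i + 1] else 0.0
    val = 0.0
    d1 = knots[i + p] - knots[i]
    if d1 > 0.0:
        val += (xi - knots[i]) / d1 * bspline_basis(i, p - 1, xi, knots)
    d2 = knots[i + p + 1] - knots[i + 1]
    if d2 > 0.0:
        val += (knots[i + p + 1] - xi) / d2 * bspline_basis(i + 1, p - 1, xi, knots)
    return val

def curve_point(xi, p, knots, ctrl):
    """C(xi) = sum_i N_{i,p}(xi) * B_i, cf. Eq. (6.3.1) with all weights equal."""
    return sum(bspline_basis(i, p, xi, knots) * ctrl[i] for i in range(len(ctrl)))

# C2(xi): second order, knot vector {0,0,0,1,2,3,4,5,5,5}, seven control points
# evenly distributed between (0, 0, 0) and (5, 0, 0)
knots = np.array([0, 0, 0, 1, 2, 3, 4, 5, 5, 5], dtype=float)
ctrl = np.array([[5.0 * i / 6.0, 0.0, 0.0] for i in range(7)])
print(curve_point(2.5, 2, knots, ctrl))   # a point on the straight line segment
```

Since all control points lie on the x-axis, the printed point lies between (0, 0, 0) and (5, 0, 0), while its x-coordinate depends nonlinearly on ξ, as illustrated by the C2 curve in Fig. 6.8.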
Let P1,1(0, 0, 0), . . . , P8,8(7, 7, 0) be the 64 control points located on a grid as given in Table 6.1, and Ni,3(ξ) and Mj,3(η) the eight B-spline basis functions of the third order constructed from the knot vector {0, 0, 0, 0, 1, 2, 3, 4, 5, 5, 5, 5}, respectively. Then a surface S1(ξ, η) in the three-dimensional space can be defined with these control points and basis functions as follows:

Fig. 6.7 Line segments constructed from NURBS basis functions of different orders

Fig. 6.8 C(ξ) versus ξ for line segments of the same shape (curves C1, C2, and C3 plotted over 0 ≤ ξ ≤ 5)

$$S_1(\xi,\eta) = \begin{pmatrix} x(\xi,\eta) \\ y(\xi,\eta) \\ z(\xi,\eta) \end{pmatrix} = \sum_{i=1}^{8} \sum_{j=1}^{8} N_{i,3}(\xi) \cdot M_{j,3}(\eta) \cdot P_{i,j} \qquad (6.3.4)$$

Figure 6.9a, b, respectively, show the locations of the control points and the generated surface (object). When the NURBS (B-spline) basis functions are used to generate objects such as lines, surfaces, and solids, the control points may be located outside the object. It can be derived from Eq. (6.3.2) that a control point Bα,β is a part of the object only if R_{α,β}^{p,q}(ξ0, η0) is 1 for some ξ0, η0. This is because all the other basis functions are 0 in that case from Eqs. (6.2.15) and (6.2.16), which results in S(ξ0, η0) = Bα,β from Eq. (6.3.2).
However, as can be seen from Fig. 6.5a, the maximum values of the majority of the basis functions are less than 1, so the control points are generally located apart from the object. Note here that one of the basis functions becomes 1 for the knot values at both ends of the open knot vector, indicating that each of the four end points at the corners of the quadrilateral shape in Fig. 6.9c corresponds to one of the control points, respectively.

Table 6.1 64(=8 × 8) control points


P1,1 (0, 0, 0) (0, 1, 0) (0, 2, 0) (0, 3, 0) (0, 4, 0) (0, 5, 0) (0, 6, 0) P1,8 (0, 7, 0)
(1, 0, 0) (1, 1, 0) (1, 2, 0) (1, 3, 0) (1, 4, 0) (1, 5, 0) (1, 6, 0) (1, 7, 0)
(2, 0, 0) (2, 1, 0) (2, 2, 0) (2, 3, 0) (2, 4, 0) (2, 5, 0) (2, 6, 0) (2, 7, 0)
(3, 0, 0) (3, 1, 0) (3, 2, 0) (3, 3, 2) (3, 4, 2) (3, 5, 0) (3, 6, 0) (3, 7, 0)
(4, 0, 0) (4, 1, 0) (4, 2, 0) (4, 3, 2) (4, 4, 2) (4, 5, 0) (4, 6, 0) (4, 7, 0)
(5, 0, 0) (5, 1, 0) (5, 2, 0) (5, 3, 0) (5, 4, 0) (5, 5, 0) (5, 6, 0) (5, 7, 0)
(6, 0, 0) (6, 1, 0) (6, 2, 0) (6, 3, 0) (6, 4, 0) (6, 5, 0) (6, 6, 0) (6, 7, 0)
P8,1 (7, 0, 0) (7, 1, 0) (7, 2, 0) (7, 3, 0) (7, 4, 0) (7, 5, 0) (7, 6, 0) P8,8 (7, 7, 0)

Fig. 6.9 a Control points. b Object (curved surface). c Object and control points. d Knot lines

Figure 6.9d shows the set of points (called knot line) where the knot values are
equal. We can see from the figure that, due to the nonlinearity shown in Fig. 6.8, the
space between the knot lines is wider for the knot values closer to the ends of the
knot vector, even though the control points are almost equally spaced. As discussed
below, an object defined by the NURBS (B-spline) basis functions can be divided
without overlap along a knot line by adding control points.
Let's study how to divide an object defined by the NURBS (B-spline) basis functions. Firstly, it is possible to add control points to the object defined by the NURBS (B-spline) basis functions without changing its shape. This is done by inserting new knot values into the knot vector for each axis as follows.
Assuming a knot vector Ξ = {ξ1, ξ2, ξ3, . . . , ξn+p−1, ξn+p, ξn+p+1} for generating the pth order basis functions in the ξ-axis and the sequence of n control points {B1, B2, . . . , Bn−1, Bn} in the direction of the ξ-axis, let m new knot values be added to the knot vector to obtain Ξ̄ = {ξ̄1, ξ̄2, ξ̄3, . . . , ξ̄n+m+p−1, ξ̄n+m+p, ξ̄n+m+p+1}. Then, a new sequence of n + m control points {B̄1, B̄2, . . . , B̄n+m−1, B̄n+m} is calculated using
$$\begin{Bmatrix} \bar{B}_1 \\ \vdots \\ \bar{B}_{n+m} \end{Bmatrix} = T \begin{Bmatrix} B_1 \\ \vdots \\ B_n \end{Bmatrix} = \begin{bmatrix} T_{1,1}^{p} & \cdots & T_{1,n}^{p} \\ \vdots & \ddots & \vdots \\ T_{n+m,1}^{p} & \cdots & T_{n+m,n}^{p} \end{bmatrix} \begin{Bmatrix} B_1 \\ \vdots \\ B_n \end{Bmatrix} \qquad (6.3.5)$$

where each component of the transformation matrix T is obtained recursively as

$$T_{i,j}^{0} = \begin{cases} 1 & \left(\bar{\xi}_i \in \left[\xi_j, \xi_{j+1}\right)\right) \\ 0 & (\text{otherwise}) \end{cases} \qquad (6.3.6)$$

$$T_{i,j}^{k} = \frac{\bar{\xi}_{i+k} - \xi_j}{\xi_{j+k} - \xi_j} T_{i,j}^{k-1} + \frac{\xi_{j+k+1} - \bar{\xi}_{i+k}}{\xi_{j+k+1} - \xi_{j+1}} T_{i,j+1}^{k-1} \qquad (6.3.7)$$

In each row of the transformation matrix in Eq. (6.3.5), only p + 1 consecutive


components at most are nonzero, making the matrix sparse when there are many
control points. It indicates that only the control points in the effective range of the
added knot values are changed.
When adding new knot values into knot vectors in two- and three-dimensional cases, knot values as well as control points are sequentially added and updated per axis: firstly, the transformation in one axis is performed by Eq. (6.3.5) for each control point sequence along that axis, and then that in the other axis in the same manner. When adding and updating control points by the knot insertion with Eqs. (6.3.5)–(6.3.7), the shape of the object remains unchanged, and the added or updated control points are located in the neighborhood region associated with the inserted knot value, closer to the object than the original ones.
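As a sketch of how Eqs. (6.3.5)–(6.3.7) can be applied in practice (our own illustration, with 0/0 terms treated as 0; the function name is hypothetical), the following Python code builds the transformation matrix T for a given pair of original and refined knot vectors and uses it to map original control points to the refined ones.

```python
import numpy as np

def insertion_matrix(p, old_knots, new_knots):
    """Build T of Eq. (6.3.5) from Eqs. (6.3.6)-(6.3.7); 0/0 terms are treated as 0."""
    n_old = len(old_knots) - p - 1          # n: original number of control points
    n_new = len(new_knots) - p - 1          # n + m: number after insertion
    T = np.zeros((n_new, n_old + p))        # k = 0 level, Eq. (6.3.6)
    for i in range(n_new):
        for j in range(n_old + p):
            if old_knots[j] <= new_knots[i] < old_knots[j + 1]:
                T[i, j] = 1.0
    for k in range(1, p + 1):               # recursion of Eq. (6.3.7)
        Tk = np.zeros((n_new, n_old + p - k))
        for i in range(n_new):
            for j in range(n_old + p - k):
                d1 = old_knots[j + k] - old_knots[j]
                d2 = old_knots[j + k + 1] - old_knots[j + 1]
                if d1 > 0.0:
                    Tk[i, j] += (new_knots[i + k] - old_knots[j]) / d1 * T[i, j]
                if d2 > 0.0:
                    Tk[i, j] += (old_knots[j + k + 1] - new_knots[i + k]) / d2 * T[i, j + 1]
        T = Tk
    return T

old = np.array([0, 0, 0, 0, 1, 2, 3, 4, 5, 5, 5, 5], dtype=float)       # cubic, 8 control points
new = np.array([0, 0, 0, 0, 1, 2, 2.5, 3, 4, 5, 5, 5, 5], dtype=float)  # one knot value inserted
T = insertion_matrix(3, old, new)
new_ctrl = T @ np.random.rand(8, 3)          # Eq. (6.3.5): 9 new control points
print(T.shape, np.allclose(T.sum(axis=1), 1.0))   # rows are affine combinations of old points
```

Only a few consecutive entries per row are nonzero, which reflects the sparsity of T noted above.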
Here, let's define some terms related to the knot span and the segment. The interval [ξi, ξj] defined by the two components ξi and ξj (i < j) of a knot vector is called a knot span. If ξi ≠ ξj, it is called a nonzero knot span, and if j = i + 1, a single knot

span. Among the curves and surfaces defined in Eqs. (6.3.1) and (6.3.2), the curve
element corresponding to a single knot span and the surface element corresponding
to the direct product of two single knot spans are called a curve segment and a surface
segment, respectively.
As for the curve segment, a pth order curve segment corresponding to a single knot span [ξi, ξi+1) is defined by a knot vector {ξi−p, ξi−p+1, . . . , ξi−1, ξi, ξi+1, ξi+2, . . . , ξi+p, ξi+p+1} with 2(p + 1) components and p + 1 control points {Bi−p, Bi−p+1, . . . , Bi−1, Bi}. When the knot vector of a segment has a structure of {a, . . . , a, b, . . . , b} (with a and b each repeated p + 1 times) or {c, a, . . . , a, b, . . . , b, d} (with a and b each repeated p times), the segment is called a Bezier segment.
Consider dividing the surface shown in Fig. 6.9d into 25 surface segments along the knot lines. First, to divide the surface into segments in the ξ direction, knot values are added to the original knot vector {0, 0, 0, 0, 1, 2, 3, 4, 5, 5, 5, 5} to construct a new knot vector {0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5}. The basis functions constructed from this new knot vector are shown in Fig. 6.10.
Because of the repeated knot values, the basis functions are C 0 continuous at the
repeated knot values except for both ends, and the knot spans on both sides of the
repeated knot values share only one control point at the knot value. This means that
both the knot spans and the curve segments defined on them are separable at the
knot value without overlapping. As an example, for the sequence of control points
corresponding to η = 0, a new sequence of control points is generated from the
original one based on Eq. (6.3.5) as follows:
$$\begin{Bmatrix} \bar{P}_{1,1} \\ \vdots \\ \bar{P}_{16,1} \end{Bmatrix} = \begin{bmatrix} T_{1,1}^{3} & \cdots & T_{1,8}^{3} \\ \vdots & \ddots & \vdots \\ T_{16,1}^{3} & \cdots & T_{16,8}^{3} \end{bmatrix} \begin{Bmatrix} P_{1,1} \\ \vdots \\ P_{8,1} \end{Bmatrix} \qquad (6.3.8)$$

Fig. 6.10 Knot insertion (basis functions constructed from the refined knot vector, plotted over 0 ≤ ξ ≤ 5)

In the case of dividing the new control points along the knot line into five segments,
the control points and knot vectors belonging to each segment are shown in Table
6.2. In the case of dividing the surface into 25 surface segments as shown in Fig. 6.11,
those of the two surface segments A and B in the figure are shown in Table 6.3 as
examples. It can be seen that all the segments shown in Tables 6.2 and 6.3 are Bezier
segments.
For Bezier segments, it is possible to elevate or reduce the orders of the basis
functions. The procedure for elevating the order of a one-dimensional Bezier (curve)
segment is explained as follows. When a p-th order segment consisting of p+1 control

Table 6.2 Division into five segments


Line segment Control points Knot vector
1 P 1,1 , P 2,1 , P 3,1 , P 4,1 {0, 0, 0, 0, 1, 1, 1, 2}
2 P 4,1 , P 5,1 , P 6,1 , P 7,1 {0, 1, 1, 1, 2, 2, 2, 3}
3 P 7,1 , P 8,1 , P 9,1 , P 10,1 {1, 2, 2, 2, 3, 3, 3, 4}
4 P 10,1 , P 11,1 , P 12,1 , P 13,1 {2, 3, 3, 3, 4, 4, 4, 5}
5 P 13,1 , P 14,1 , P 15,1 , P 16,1 {3, 4, 4, 4, 5, 5, 5, 5}

Fig. 6.11 Surface division

Table 6.3 Surface segments shown in Fig. 6.11

Surface segment   Control points                               Knot vector
A                 P̄7,4, P̄8,4, P̄9,4, P̄10,4                   {1, 2, 2, 2, 3, 3, 3, 4} for ξ
                  P̄7,5, P̄8,5, P̄9,5, P̄10,5                   {0, 1, 1, 1, 2, 2, 2, 3} for η
                  P̄7,6, P̄8,6, P̄9,6, P̄10,6
                  P̄7,7, P̄8,7, P̄9,7, P̄10,7
B                 P̄10,1, P̄11,1, P̄12,1, P̄13,1                {2, 3, 3, 3, 4, 4, 4, 5} for ξ
                  P̄10,2, P̄11,2, P̄12,2, P̄13,2                {0, 0, 0, 0, 1, 1, 1, 2} for η
                  P̄10,3, P̄11,3, P̄12,3, P̄13,3
                  P̄10,4, P̄11,4, P̄12,4, P̄13,4
points {B1, B2, . . . , Bp, Bp+1} is to be elevated in its degree, the p + 2 control points {B̄1, B̄2, . . . , B̄p+1, B̄p+2} that constitute the elevated (p + 1)-th order segment are obtained by the following equation,

$$\bar{B}_i = \frac{i-1}{p+1} B_{i-1} + \frac{p+2-i}{p+1} B_i \quad (i = 1, \ldots, p+2) \qquad (6.3.9)$$
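The elevation formula can be sketched in a few lines of Python (our illustration; the end terms whose coefficients vanish are handled explicitly so that no undefined control point is accessed):

```python
import numpy as np

def elevate_bezier(ctrl):
    """Apply Eq. (6.3.9) to the (p+1) control points of a Bezier segment (rows of ctrl)."""
    p = ctrl.shape[0] - 1
    new = np.zeros((p + 2, ctrl.shape[1]))
    for i in range(1, p + 3):                 # i = 1, ..., p+2 as in Eq. (6.3.9)
        if i <= p + 1:                        # coefficient of B_i vanishes for i = p+2
            new[i - 1] += (p + 2 - i) / (p + 1) * ctrl[i - 1]
        if i >= 2:                            # coefficient of B_{i-1} vanishes for i = 1
            new[i - 1] += (i - 1) / (p + 1) * ctrl[i - 2]
    return new

quad = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]])   # a quadratic Bezier segment
print(elevate_bezier(quad))                             # four control points, same shape
```

The first and last control points are reproduced exactly, and the intermediate ones are convex combinations of neighboring original points, so the segment shape is preserved.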

On the other hand, the order reduction procedure for a one-dimensional Bezier (curve) segment is given as follows. When a p-th order segment consisting of p + 1 control points {B1, B2, . . . , Bp, Bp+1} is to be reduced in its degree, the new p control points {B̄1, B̄2, . . . , B̄p−1, B̄p} of the (p − 1)-th order segment are generated according to whether p is even or odd, as follows.
In the case where p is even:

$$\bar{B}_i = \begin{cases} B_i & (i = 1) \\ \dfrac{B_i - \alpha_i \bar{B}_{i-1}}{1 - \alpha_i} & (i = 2, \ldots, r) \\ \dfrac{B_{i+1} - (1 - \alpha_{i+1}) \bar{B}_{i+1}}{\alpha_{i+1}} & (i = r+1, \ldots, p-1) \\ B_{p+1} & (i = p) \end{cases} \qquad (6.3.10)$$

where

$$r = \frac{p-2}{2}, \quad \alpha_i = \frac{i-1}{p} \qquad (6.3.11)$$

In the case where p is odd:

$$\bar{B}_i = \begin{cases} B_i & (i = 1) \\ \dfrac{B_i - \alpha_i \bar{B}_{i-1}}{1 - \alpha_i} & (i = 2, \ldots, r-1) \\ \dfrac{\bar{B}_r^{L} + \bar{B}_r^{R}}{2} & (i = r) \\ \dfrac{B_{i+1} - (1 - \alpha_{i+1}) \bar{B}_{i+1}}{\alpha_{i+1}} & (i = r+1, \ldots, p-1) \\ B_{p+1} & (i = p) \end{cases} \qquad (6.3.12)$$

where

$$r = \frac{p-1}{2}, \quad \alpha_i = \frac{i-1}{p} \qquad (6.3.13)$$

$$\bar{B}_r^{L} = \frac{B_r - \alpha_r \bar{B}_{r-1}}{1 - \alpha_r}, \quad \bar{B}_r^{R} = \frac{B_{r+1} - (1 - \alpha_{r+1}) \bar{B}_{r+1}}{\alpha_{r+1}} \qquad (6.3.14)$$

Though it is possible to reduce the order of the basis functions of a one-dimensional


Bezier (curved) segment, it should be noted that it results in the reduction of accuracy
of the shape expressed by the basis functions.

In the case of curved surfaces, the order elevation or reduction operation is


performed in a similar manner. Note that the original shape is preserved for the
order elevation, but not necessarily for the order reduction.
Here, some examples of Bezier surface segments are shown: Fig. 6.12a shows a
segment defined with the first-order B-spline basis functions, Fig. 6.12b that defined
with second-order ones, and Fig. 6.12c that defined with third-order ones. The coor-
dinates of control points and knot vectors used to generate these geometries are
summarized in Table 6.4.

6.4 Local Contact Search for Surface-to-Surface Contact

In the present section, a contact search method between smooth contact surfaces is
studied based on the conventional contact search method.
It is well-recognized that the isogeometric analysis using NURBS as the basis
functions can be applied to the dynamic contact analysis.
First, consider the dynamic analysis based on the isogeometric analysis. In the dynamic explicit finite element method, the first-order basis functions are usually used because higher-order basis functions often require a smaller value of the time step Δt, increasing the computational load.
On the other hand, in the isogeometric analysis using the NURBS (B-spline) basis
functions, the constraint on the time step Δt is not severe even if the higher-order
basis functions are used. Figure 6.13 shows the maximum time step, with which the
one-dimensional wave propagation analysis under the condition of constant nodal
(control point) spacing can be stably performed. The horizontal axis is the order of the
basis function and the vertical axis the maximum time step width for stable analysis,
shown as the ratio to that in the finite element analysis with the basis functions of the
first order. It can be seen from the figure that in the case of the finite element analysis,
Δt becomes smaller as the order of the basis functions increases, while in the case
of using the B-spline basis functions, the constraint on Δt is conversely relaxed as
the order increases.
As shown in Fig. 6.14, this tendency is even more pronounced when the knot
values are repeated several times (see Sects. 6.2 and 6.3). The horizontal axis is the
knot multiplicity and the vertical axis the maximum time step Δt for stable analysis,
which is again a ratio to that for the case of the finite element analysis with the basis
functions of the first order. As can be seen from the figure, the use of the NURBS basis
functions relaxes the constraint on the time step in the explicit dynamic analysis.
Next, consider the contact analysis, especially the contact search with the isogeo-
metric analysis. The contact search differs greatly between the ordinary finite element
method and the isogeometric analysis using NURBS as the basis function. As shown
in Sect. 6.1, in the former, the local contact search between contact surfaces is based
on the calculation of distances between nodes on one of the contact surface and
segments on the other.

Fig. 6.12 a Bezier surface segment of the first order. b Bezier surface segment of the second order. c Bezier surface segment of the third order

The reason why the contact search can be attributed to the contact search between
a point and a surface (point-to-surface type) is that the nodes are always on the contact
surface, and also the shape of the segment is simple when the linear basis functions
are used (see Fig. 6.12a).
On the other hand, when the contact surface is a NURBS surface, the local
contact search between the contact surfaces is performed between the Bezier surface

Table 6.4 Specifications of Bezier surface segments

Bezier segment          Control points                                                    Knot vector
1st order (Fig. 6.12a)  P1,1(0, 0, 0), P1,2(0, 3, 0.5)                                    {0, 0, 1, 1} for ξ, η
                        P2,1(3, 0, −0.5), P2,2(3, 3, 0)
2nd order (Fig. 6.12b)  P1,1(0, 0, 0), (0, 1.5, −0.5), P1,3(0, 3, 0)                      {0, 0, 0, 1, 1, 1} for ξ, η
                        P2,1(1.5, 0, −0.5), (0, 1.5, 0.5), P2,3(1.5, 3, −0.5)
                        P3,1(3, 0, 0), (3, 1.5, −0.5), P3,3(3, 3, 0)
3rd order (Fig. 6.12c)  P1,1(0, 0, 0), (0, 1, 0.5), (0, 2, −0.5), P1,4(0, 3, 0)           {0, 0, 0, 0, 1, 1, 1, 1} for ξ, η
                        P2,1(1, 0, 0.5), (1, 1, −0.5), (1, 2, 0.5), P2,4(1, 3, −0.5)
                        P3,1(2, 0, −0.5), (2, 1, 0.5), (2, 2, −0.5), P3,4(2, 3, 0.5)
                        P4,1(3, 0, 0), (3, 1, −0.5), (3, 2, 0.5), P4,4(3, 3, 0)

Fig. 6.13 Time step versus order of basis functions (critical time step, normalized; FEA and NURBS)

segments generated by dividing the contact surfaces. Unlike the point-to-surface type
contact search in the finite element method where a node is used as a representative
point of a contact surface, a Bezier surface segment does not have obvious control
points that represent the shape. This is because the control points constituting a
Bezier surface segment are not necessarily located on the segment. For this reason,
a difficulty arises in using some control points for contact detection.
As an example, Fig. 6.15 shows two Bezier surface segments facing each other
and their control points. In the figure, the control points of the lower segment are
shown in red, and those of the upper segment in blue. The two segments are not in
contact, but their control points intersect each other, which indicates the difficulty of
using some control points for contact detection in this case.

Fig. 6.14 Time step versus multiplicity of knot values (critical time step, normalized; FEA and NURBS of the 2nd–5th orders)

Fig. 6.15 Bezier segments in proximity

Note, even in the contact search between two Bezier surface segments, the point-
to-surface type contact search can be employed to judge the contact state between
them. Specifically, a lot of points are set on one of the contact segments as shown
in Fig. 6.16, and the point-to-surface type contact search according to Eqs. (6.1.10)
and (6.1.11) is performed between each set of points and the other segment, then
the contact state at each point can be determined based on the signed distance in
Eq. (6.1.12).

Fig. 6.16 Local contact search between Bezier segments

But, in order to accurately calculate the state of contact between segments, it is necessary to set many points on the segment, which increases the amount of computation. In addition, in each step of the Newton–Raphson iteration for solving Eqs. (6.1.10) and (6.1.11), a multiply-and-add operation over the basis functions and the control point coordinates of the Bezier surface segments is required to calculate the coordinates of the contact point in the three-dimensional space, which is much more time consuming because the order of the basis functions used there is higher than that in the conventional point-to-surface search in the finite element analysis with basis functions of the first order.
Figure 6.17 shows the amount of computation per iteration in the Newton–
Raphson method. The horizontal axis is the order of the basis functions, and the
vertical axis the amount of computation per iteration, which is defined as the ratio
to that in the finite element analysis with the basis functions of the first order. The
computational complexity increases as the order of the basis functions increases. For
example, the computational complexity with the third-order basis functions is nearly
10 times of that with the first-order basis functions in the finite element analysis.
Note that the number of iterations required increases only slightly as the order of the
basis functions increases.
As described so far, the surface-to-surface contact search is a much more compu-
tationally demanding process than the point-to-surface contact search in the finite
element analysis.

6.5 Local Contact Search with Deep Learning

In this section, to solve the computational load of the surface-to-surface local search,
a fast and stable local contact search method using feedforward neural networks and
deep learning is presented.
Now, let’s look again at the contact between segments. Figure 6.18a shows two
Bezier surface segments in contact, and Fig. 6.18b a rotated version of the segment
pair of Fig. 6.18a. The contact conditions (local coordinates of the contact points,
penetration depth, etc.) in Fig. 6.18a, b are identical. Thus, the contact state is

invariant to translation and rotation, indicating that the contact state between Bezier surface segments is determined only by the shape of both segments and their relative arrangement.

Fig. 6.17 Computational complexity of the node-segment algorithm (ratio of computational complexity versus order of NURBS basis functions)
Therefore, the local contact search is regarded as a process to obtain a mapping
from the shape and relative arrangement of the two segments to the contact state
between them. By constructing this mapping on a feedforward neural network,
the surface-to-surface local contact search can be performed without iterative
computation [11, 13, 14].
Thus, the surface-to-surface local contact search using a feedforward neural
network based on deep learning consists of the following three phases [11].
(1) Data Preparation Phase: A number of pairs of Bezier surface segments with various shapes and relative arrangements are set, and, for each pair, the contact state between the two segments is calculated using the method described in Sect. 6.4. In this way, a large number of data pairs of segment shapes and relative arrangements and the corresponding contact states, called training patterns, are collected.

Fig. 6.18 Two different pairs of contacting segments with the same state of contact

(2) Training Phase: The patterns collected in the Data Preparation Phase are used
to train a feedforward neural network through deep learning with the following
condition:
Input data: shape and relative arrangements of segments,
Teacher data: contact states between the segments.
(3) Application Phase: The feedforward neural network trained in the Training
Phase is incorporated into the contact analysis code. The trained neural network
promptly outputs the contact state between two Bezier segments given, based on
their shapes and relative arrangements. Thus, the fast surface-to-surface local
contact search is performed.
Here, the constraints on the input data are discussed. The shape of a Bezier surface
segment is defined by the sum of the products of the basis functions and the control
points as shown in Sect. 6.3. The basis function of the Bezier surface segment is
common among segments of the same order, so the shape of a surface segment is
effectively determined only by the arrangement of the control points.
Now, consider the relative arrangement of two Bezier surface segments. A Bezier surface segment of the p-th order consists of the following (p + 1)² control points.

$$\begin{matrix} P_{1,1} & P_{1,2} & \cdots & P_{1,p+1} \\ \vdots & \vdots & \ddots & \vdots \\ P_{p+1,1} & P_{p+1,2} & \cdots & P_{p+1,p+1} \end{matrix} \qquad (6.5.1)$$

A pair of segments can be moved by translation and rotation without changing


their shapes and relative arrangement in order to place control points of one of the
segments at the prescribed positions; the control point P1,1 at the origin, Pp+1,1 on the
x-axis, and P1, p+1 on the xy-plane [11]. The detailed procedure is explained below
and also shown in Fig. 6.19.

(1) Translate a segment and place the control point P1,1 at the origin. (Translation)
(2) Rotate the segment around the z-axis and place the control point Pp+1,1 on the
xz-plane. (Rotation)

(3) Rotate the segment around the y-axis and place the control point Pp+1,1 on the x-axis. (Rotation)
(4) Rotate the segment around the x-axis and place the control point P1,p+1 on the xy-plane. (Rotation)

Fig. 6.19 Translation and rotations of a segment

Though the operations above are designed based on only one of the segments of a pair, they are simultaneously applied to both segments of the pair, so that several degrees of freedom of the control points of one segment are constrained while the relative arrangement of the two segments remains unchanged. Thus, the total number of shape parameters of the two segments is reduced, and patterns with the same relative arrangement are consolidated into one, which enables efficient learning in the Training Phase.
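A minimal sketch of the four operations above (ours; the row-by-row control point ordering and the function name are assumptions, and in practice the same rigid motion would be applied to both segments of a pair):

```python
import numpy as np

def normalize_segment(pts, p):
    """pts: control points P_{i,j} stored row by row, shape ((p+1)**2, 3)."""
    pts = pts - pts[0]                             # (1) translate P_{1,1} to the origin
    def rot(axis, angle):
        c, s = np.cos(angle), np.sin(angle)
        mats = {"z": [[c, -s, 0], [s, c, 0], [0, 0, 1]],
                "y": [[c, 0, s], [0, 1, 0], [-s, 0, c]],
                "x": [[1, 0, 0], [0, c, -s], [0, s, c]]}
        return np.array(mats[axis])
    corner = pts[p * (p + 1)]                      # P_{p+1,1}: first point of the last row
    pts = pts @ rot("z", -np.arctan2(corner[1], corner[0])).T   # (2) into the xz-plane
    corner = pts[p * (p + 1)]
    pts = pts @ rot("y", np.arctan2(corner[2], corner[0])).T    # (3) onto the x-axis
    corner = pts[p]                                # P_{1,p+1}: last point of the first row
    pts = pts @ rot("x", -np.arctan2(corner[2], corner[1])).T   # (4) onto the xy-plane
    return pts

print(normalize_segment(np.random.rand(9, 3), 2).round(3))      # a second-order segment
```

After the call, the constrained coordinates of P1,1, Pp+1,1, and P1,p+1 are zero by construction, which is exactly the reduction of input parameters exploited in the numerical example of Sect. 6.6.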
Note that the surface-to-surface local contact search can be applied to Bezier
surface segments of various orders. It can be used in such various combinations as
the contact search between a Bezier surface segment based on the quadratic basis
functions and that based on the cubic basis functions, and that between a segment
of the fourth order and that of the fifth order. In addition, it can be employed for the
cases where the order of Bezier surface segments differs in each axis.
The number of shape parameters of Bezier surface segments varies depending on
the order of the basis functions. Then, when using feedforward neural networks for
local contact search, a neural network has to be constructed for each combination of
the two Bezier surface segments with different orders. However, this is not necessarily
efficient.
We could mitigate this inefficiency by making use of the properties of the Bezier
segments. As shown in Sect. 6.3, the order elevation or reduction can be applied to a
Bezier surface segment. If a Bezier surface segment of arbitrary order is approximated
by that of a predetermined order, then we have only to construct a feedforward neural
network for the pairs of approximated segments of the prescribed order. Thus, the
local contact search between any pair of segments of various orders can be performed
only by a single feedforward neural network. Here, a Bezier surface segment of
arbitrary order is approximated with that of the second order.
However, as shown in Sect. 6.3, a higher-order segment has a higher ability to
represent complex shape (see Fig. 6.12a–c), so the approximation accuracy can be
a problem when, for example, approximating a Bezier surface segment of the fifth
order with that of the second order. In this problem, subdivision of a segment could
be adopted: the complexity of the shape of the smaller segments generated by adding
new knot values and repartitioning the original segments is significantly reduced
[11].
The above process makes it possible to perform the surface-to-surface contact
search with a single feedforward neural network by performing subdivision until
it can be approximated with sufficient accuracy by Bezier surface segments of the
second order.

6.6 Numerical Example

We show here a numerical example of the local contact search method using deep
learning described in Sect. 6.5.

6.6.1 Data Preparation Phase

Let’s discuss an application of the surface-to-surface local contact search using deep
learning to segment pairs whose basis functions are second-order NURBS in both
axes. Here, a feedforward neural network is trained to estimate the contact state
between segments using patterns generated from a large number of segment pairs
(both are Bezier surface segments) with various configurations.
Firstly, a lot of segment pairs are generated. After placing the nine control points that constitute one of the two Bezier surface segments of the second order, called the master segment, at the grid reference positions (Table 6.5), all the coordinates are modified by adding uniform random numbers, with the x-, y-, and z-coordinates of P1,1, the y- and z-coordinates of P3,1, and the z-coordinate of P1,3 being fixed. The range of the modification is set to [−0.3, 0.3] for the x-, y-, and z-coordinates, and the weight of each control point is set in the range of [0.5, 2.0] using a uniform random number. Thus, a lot of second-order Bezier surface segments of various shapes are generated.
The other segment of the segment pair called the slave segment is also gener-
ated through the same procedure above, then random rotation and translation are
performed on the slave segment. Specifically, for a slave segment, we perform the
rotation around the z-axis in the range of (−π, π ), that around the x-axis in the range
of (−π/4, π/4), then that around the y-axis in the range of (−π/4, π/4), and then
the translation in the range of [−2.0, 2.0] in x- and y-directions and in the range of
[0.0, 1.0] in z-direction. As a result, a large number of segment pairs, i.e., a lot of
pairs of a master segment and the corresponding slave segment with various shapes
and relative positions are created.
Out of 72 (= 18 × 4) parameters for the coordinates and weights of the total 18
control points that make up the two Bezier surface segments of the second order, the
66 parameters excluding the six fixed coordinates of the master segment represent
the shape and relative arrangement of each segment pair.
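The generation procedure can be sketched as follows (our illustration; the random number generator, seeding, and array layout are assumptions not specified in the text):

```python
import numpy as np

rng = np.random.default_rng(0)

BASE = np.array([[-1.0 + j, -1.0 + i, 0.0] for i in range(3) for j in range(3)])
# BASE[0] = P11(-1,-1,0), BASE[2] = P13(1,-1,0), BASE[6] = P31(-1,1,0), cf. Table 6.5

def random_segment():
    pts = BASE + rng.uniform(-0.3, 0.3, size=(9, 3))
    pts[0] = BASE[0]                      # x, y, z of P11 fixed
    pts[6, 1:] = BASE[6, 1:]              # y, z of P31 fixed
    pts[2, 2] = BASE[2, 2]                # z of P13 fixed (harmless for the slave, see below)
    w = rng.uniform(0.5, 2.0, size=9)     # control point weights
    return pts, w

def rot(axis, a):
    c, s = np.cos(a), np.sin(a)
    mats = {"z": [[c, -s, 0], [s, c, 0], [0, 0, 1]],
            "x": [[1, 0, 0], [0, c, -s], [0, s, c]],
            "y": [[c, 0, s], [0, 1, 0], [-s, 0, c]]}
    return np.array(mats[axis])

def random_slave():
    pts, w = random_segment()             # same construction as the master
    for axis, lim in (("z", np.pi), ("x", np.pi / 4), ("y", np.pi / 4)):
        pts = pts @ rot(axis, rng.uniform(-lim, lim)).T
    pts += np.array([rng.uniform(-2, 2), rng.uniform(-2, 2), rng.uniform(0, 1)])
    return pts, w

master, wm = random_segment()
slave, ws = random_slave()
# the 66 input parameters are the slave coordinates/weights plus the non-fixed master ones
```

Repeating this construction many times yields the population of segment pairs from which the training patterns are computed.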

Table 6.5 Control points of the second-order master segment

P1,1(−1.0, −1.0, 0.0)   P1,2(0.0, −1.0, 0.0)   P1,3(1.0, −1.0, 0.0)
P2,1(−1.0, 0.0, 0.0)    P2,2(0.0, 0.0, 0.0)    P2,3(1.0, 0.0, 0.0)
P3,1(−1.0, 1.0, 0.0)    P3,2(0.0, 1.0, 0.0)    P3,3(1.0, 1.0, 0.0)

Secondly, for each of a large number of segment pairs generated above, the contact
state is calculated using the method described in Sect. 6.4. The number of sampling
points on the slave segment side is set to 121 (= 11 × 11) located in an equally
spaced grid pattern in each direction of (ξ, η). The contact state data to be estimated
can be selected arbitrarily. Here, the followings are selected as examples of contact
state data.

(a) In contact or not in contact: 1 if in contact, 0 if not in contact.


If segments are in contact, then the following data are to be estimated.
(b) Coordinates of the contact points (ξ_S^C, η_S^C) and (ξ_M^C, η_M^C): Local coordinates (ξ_S^C, η_S^C) of the point on the slave segment that penetrates the master segment the deepest, and local coordinates (ξ_M^C, η_M^C) of the projection point of the slave point onto the master segment.
(c) Average penetration depth of contact area Dcontact : Average value of penetration
depth over penetrating sampling points.
(d) Area ratio of the contact area Scontact : Percentage of sample points that penetrate
the master segment out of 121 sample points placed on the slave segment.
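Given the signed distances at the 121 sampling points produced by the point-to-surface search of Sect. 6.4 (not reproduced here), items (a), (c), and (d) reduce to simple array operations; the following sketch assumes such an array g and the sign convention that negative values mean penetration:

```python
import numpy as np

def contact_state(g):
    """g: signed distances at the 121 (= 11 x 11) sampling points on the slave segment."""
    penetrating = g < 0.0
    s_contact = penetrating.mean()                     # (d) fraction of penetrating points
    d_contact = -g[penetrating].mean() if penetrating.any() else 0.0   # (c) mean depth
    in_contact = 1 if penetrating.any() else 0         # (a) contact flag
    return in_contact, d_contact, s_contact

g = np.where(np.random.rand(121) < 0.3, -0.01 * np.random.rand(121), 0.05)  # dummy data
print(contact_state(g))
```

The contact point coordinates of item (b) would additionally require the local coordinates of the deepest penetrating sampling point and of its projection, which come directly from the same search.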

In this manner, we can obtain a large number of data pairs (patterns) consisting of the shapes and relative arrangements of two segments and the contact states between them. Here, 50,000 patterns representing segment pairs in contact and 50,000 patterns representing those not in contact are generated.

6.6.2 Training Phase

From the 50,000 patterns representing segments in contact with each other, 25,000 patterns are selected at random for training and the remaining 25,000 patterns are used for verification of the generalization capability.
Three feedforward neural networks to identify each of the contact states are constructed using the training patterns; the networks are trained to estimate the coordinates of the contact points (ξ_S^C, η_S^C) and (ξ_M^C, η_M^C), the average penetration depth Dcontact, and the area ratio Scontact, respectively. The number of units in the input layer is 66 for all the above neural networks, and the number of units in the output layer is 4, 1, and 1, respectively.
Based on the results of training with neural networks of various sizes, the neural network for predicting the coordinates of the contact point has been set to have 5 hidden layers with 40 units per hidden layer. In the same way, that for estimating Dcontact has 3 hidden layers with 40 units per hidden layer, and that for estimating Scontact 2 hidden layers with 40 units per hidden layer.
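A sketch of the three networks in PyTorch (the framework and the choice of ReLU activations are our assumptions; the chapter specifies only the layer counts and unit numbers):

```python
import torch
import torch.nn as nn

def mlp(n_in, n_hidden_layers, n_units, n_out):
    """Fully connected feedforward network with the given hidden layout."""
    layers, n_prev = [], n_in
    for _ in range(n_hidden_layers):
        layers += [nn.Linear(n_prev, n_units), nn.ReLU()]
        n_prev = n_units
    layers.append(nn.Linear(n_prev, n_out))
    return nn.Sequential(*layers)

net_coords = mlp(66, 5, 40, 4)      # (xi_S^C, eta_S^C, xi_M^C, eta_M^C)
net_depth  = mlp(66, 3, 40, 1)      # D_contact
net_area   = mlp(66, 2, 40, 1)      # S_contact

x = torch.randn(8, 66)              # a batch of 8 segment-pair descriptors
print(net_coords(x).shape, net_depth(x).shape, net_area(x).shape)
```

Each network takes the 66 shape and arrangement parameters described in Sect. 6.6.1 as input and is trained with the corresponding contact state values as teacher data.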

6.6.3 Application Phase

The estimation accuracy of the trained neural network for estimating the coordinates of the contact point is shown in Fig. 6.20, where Fig. 6.20a shows the distribution of the estimation error of the master side (ξ_M^C, η_M^C) and Fig. 6.20b that of the slave side (ξ_S^C, η_S^C). The estimation errors of the slave side (ξ_S^C, η_S^C) are a little larger than those of the master side, but both of them are estimated with good accuracy.
The estimation accuracies of the neural networks for the mean penetration depth
and the area ratio are shown in Figs. 6.21 and 6.22, respectively. Both figures show
the distribution of errors in the standardized data range where the maximum value
is 1 and the minimum value 0. Although the estimation accuracy of these values

Fig. 6.20 Distributions of errors in estimation of a (ξ_M^C, η_M^C) (master) and b (ξ_S^C, η_S^C) (slave) by deep learning. Reprinted from [11] with permission from Springer

Fig. 6.21 Distributions of errors in estimation of Dcontact by deep learning (learning and test patterns). Reprinted from [11] with permission from Springer

Fig. 6.22 Distributions of errors in estimation of Scontact by deep learning (learning and test patterns). Reprinted from [11] with permission from Springer

is lower than that of the contact point coordinates, it can be said that the trained neural networks estimate them well.
As described above, it has been shown that various contact states can be predicted in detail in the surface-to-surface contact search by using deep learning.

References

1. Bathe, K.-J.: Finite Element Procedures, Prentice-Hall (1996)


2. Benson, D.J., Hallquist, J.O.: A single surface contact algorithm for the post-buckling analysis
of shell structures. Comput. Methods Appl. Mech. Eng. 78, 141–163 (1990)
3. Ericson, C.: Real-time collision detection. Morgan Kaufmann (2005)
4. Cottrell, J.A., Hughes, T.J.R., Bazilevs, Y.: Isogeometric Analysis. Wiley (2009)

5. Hallquist, J.O., Goudreau, G.L., Benson, D.J.: Sliding interfaces with contact-impact in large-
scale Lagrangian computations. Comput. Methods Appl. Mech. Eng. 51, 107–137 (1985)
6. Hughes, T.J.R., Cottrell, J.A., Bazilevs, Y.: Isogeometric Analysis: CAD, finite elements,
NURBS, exact geometry, and mesh refinement. Comput. Methods Appl. Mech. Eng. 194,
4135–4195 (2005)
7. Konyukhov, A., Izi, R.: Introduction to Computational Contact Mechanics: A Geometric
Approach. Wiley (2015)
8. Konyukhov, A., Schweizerhof, K.: Computational Contact Mechanics: Geometrically Exact
Theory for Arbitrary Shaped Bodies. Springer (2012)
9. Laursen, T.A.: Computational Contact and Impact Mechanics: Fundamentals of modeling
interfacial phenomena in nonlinear finite element analysis. Springer (2002)
10. Liu, W.N., Meschke, G., Mang, H.A.: A note on the algorithmic stabilization of 2d contact analyses. In: Gaul, L., Brebbia, C.A. (eds.) Computational Methods in Contact Mechanics IV, pp. 231–240. Wessex Institute (1999)
11. Oishi, A., Yagawa, G.: A surface-to-surface contact search method enhanced by deep learning.
Comput. Mech. 65, 1125–1147 (2020)
12. Oishi, A., Yamada, K., Yoshimura, S., Yagawa, G.: Domain decomposition based parallel
contact algorithm and its implementation to explicit finite element analysis. JSME Int. J. 45A(2),
123–130 (2002)
13. Oishi, A., Yoshimura, S.: A new local contact search method using a multi-layer neural network.
Comput. Model. Eng. Sci. 21(2), 93–103 (2007)
14. Oishi, A., Yoshimura, S.: Genetic approaches to iteration-free local contact search. Comput.
Model. Eng. Sci. 28(2), 127–146 (2008)
15. Piegl, L., Tiller, W.: The NURBS Book 2nd ed. Springer (2000)
16. Puso, M.A., Laursen, T.A.: A 3D contact smoothing method using Gregory patches. Int. J.
Numer. Methods Eng. 54, 1161–1194 (2002)
17. Rogers, D.F.: An Introduction to NURBS with Historical Perspective. Academic Press (2001)
18. Sevilla, R., Fernandez-Mendez, S., Huerta, A.: NURBS-enhanced finite element method
(NEFEM). Int. J. Numer. Methods Eng. 76, 56–83 (2008)
19. Sevilla, R., Fernandez-Mendez, S., Huerta, A.: 3D NURBS-enhanced finite element method
(NEFEM). Int. J. Numer. Methods Eng. 88, 103–125 (2011)
20. Wang, F., Cheng, J., Yao, Z.: FFS contact searching algorithm for dynamic finite element
analysis. Int. J. Numer. Methods Eng. 52, 655–672 (2001)
21. Wriggers, P.: Computational contact mechanics. John Wiley & Sons (2002)
22. Zhong, Z.H.: Finite Element Procedures for Contact-Impact Problems. Oxford U.P. (1993)
Chapter 7
Flow Simulation with Deep Learning

Abstract In the previous chapters, we have studied various topics related to the
application of deep learning to solid mechanics. In this chapter, we will discuss the
application of deep learning to fluid dynamics problems. Section 7.1 describes the
basic equations of fluid dynamics, Sect. 7.2 the basics of the finite difference method,
one of the most popular methods for solving fluid dynamics problems, Sect. 7.3 a
practical example of a two-dimensional fluid dynamics simulation, Sect. 7.4 the
formulation of the application of deep learning to fluid dynamics problems, Sect. 7.5
recurrent neural networks that are suitable for the time-dependent problems covered
in this chapter, and finally, Sect. 7.6 a real application of deep learning to the fluid
dynamics simulation.

7.1 Equations for Flow Simulation

First, let's derive the basic equations of fluid mechanics (dynamics), which consist of the three conservation laws:
the law of the conservation of mass,
the law of the conservation of momentum, and
the law of the conservation of energy,
together with the constitutive equation, which describes the properties of specific fluids.
Assume both the velocity v(m/s) and mass density ρ(kg/m3 ) of the fluid to be
functions of position and time as follows:
$$\mathbf{v} = \begin{pmatrix} u(x, y, z, t) \\ v(x, y, z, t) \\ w(x, y, z, t) \end{pmatrix}, \quad \rho = \rho(x, y, z, t) \qquad (7.1.1)$$

Let’s consider a small rectangular parallelepiped in a flow field as shown in


Fig. 7.1. Assuming that there is no inflow or outflow inside the rectangular paral-
lelepiped, the following equation can be derived from the fact that the time variation


of the mass inside it is equal to the sum of the masses entering and leaving it (the
law of the conservation of mass).
$$\frac{\partial}{\partial t}(\rho\,dx\,dy\,dz) = \left\{ \rho u\,dy\,dz - \left( \rho u + \frac{\partial(\rho u)}{\partial x} dx \right) dy\,dz \right\} + \left\{ \rho v\,dz\,dx - \left( \rho v + \frac{\partial(\rho v)}{\partial y} dy \right) dz\,dx \right\} + \left\{ \rho w\,dx\,dy - \left( \rho w + \frac{\partial(\rho w)}{\partial z} dz \right) dx\,dy \right\} \qquad (7.1.2)$$

Rearranging this equation, we have the equation of continuity as follows:

$$\frac{\partial \rho}{\partial t} + \frac{\partial(\rho u)}{\partial x} + \frac{\partial(\rho v)}{\partial y} + \frac{\partial(\rho w)}{\partial z} = 0 \qquad (7.1.3)$$

Next, consider the equation of motion for a small rectangular parallelepiped in


a flow field. Figure 7.2 illustrates the stresses (N/m2 ) acting on each surface of
the parallelepiped, where Fx , Fy , and Fz are the body forces per unit mass (N/kg).
Figure 7.3 shows only the components in the x-axis direction. From this figure,
Newton’s equation of motion in the x-axis direction is derived as
$$\rho\,dx\,dy\,dz\,\frac{Du}{Dt} = \left\{ \left( \sigma_{xx} + \frac{\partial \sigma_{xx}}{\partial x} dx \right) dy\,dz - \sigma_{xx}\,dy\,dz \right\} + \left\{ \left( \sigma_{yx} + \frac{\partial \sigma_{yx}}{\partial y} dy \right) dz\,dx - \sigma_{yx}\,dz\,dx \right\} + \left\{ \left( \sigma_{zx} + \frac{\partial \sigma_{zx}}{\partial z} dz \right) dx\,dy - \sigma_{zx}\,dx\,dy \right\} + F_x \rho\,dx\,dy\,dz \qquad (7.1.4)$$

Fig. 7.1 Mass conservation in a flow field

Rearranging this equation, we have the equation of motion in the x-axis direction as

$$\rho \frac{Du}{Dt} = \frac{\partial \sigma_{xx}}{\partial x} + \frac{\partial \sigma_{yx}}{\partial y} + \frac{\partial \sigma_{zx}}{\partial z} + F_x \rho \qquad (7.1.5)$$

The equations of motion in the y- and z-axis directions can be obtained similarly as

$$\rho \frac{Dv}{Dt} = \frac{\partial \sigma_{xy}}{\partial x} + \frac{\partial \sigma_{yy}}{\partial y} + \frac{\partial \sigma_{zy}}{\partial z} + F_y \rho \qquad (7.1.6)$$

$$\rho \frac{Dw}{Dt} = \frac{\partial \sigma_{xz}}{\partial x} + \frac{\partial \sigma_{yz}}{\partial y} + \frac{\partial \sigma_{zz}}{\partial z} + F_z \rho \qquad (7.1.7)$$

where Du/Dt, Dv/Dt, and Dw/Dt are called the material derivatives, which take into account the movement of matter, and are defined by

$$\frac{Du}{Dt} = \frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + w\frac{\partial u}{\partial z}, \quad \frac{Dv}{Dt} = \frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + w\frac{\partial v}{\partial z}, \quad \frac{Dw}{Dt} = \frac{\partial w}{\partial t} + u\frac{\partial w}{\partial x} + v\frac{\partial w}{\partial y} + w\frac{\partial w}{\partial z} \qquad (7.1.8)$$

Fig. 7.2 Stresses

Fig. 7.3 Stresses along x-axis

Note that Eqs. (7.1.5)–(7.1.7) are also called Euler's equations of motion.
The constitutive equation of the Newtonian fluid can be written as

$$\begin{aligned} \sigma_{xx} &= -p + \lambda\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right) + 2\mu\frac{\partial u}{\partial x} \\ \sigma_{yy} &= -p + \lambda\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right) + 2\mu\frac{\partial v}{\partial y} \\ \sigma_{zz} &= -p + \lambda\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right) + 2\mu\frac{\partial w}{\partial z} \\ \sigma_{xy} &= \sigma_{yx} = \mu\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) \\ \sigma_{yz} &= \sigma_{zy} = \mu\left(\frac{\partial v}{\partial z} + \frac{\partial w}{\partial y}\right) \\ \sigma_{zx} &= \sigma_{xz} = \mu\left(\frac{\partial w}{\partial x} + \frac{\partial u}{\partial z}\right) \end{aligned} \qquad (7.1.9)$$

where p is the pressure (N/m²), μ the viscosity coefficient (N·s/m²), and λ the second viscosity coefficient (N·s/m²). Note that a fluid taking Eq. (7.1.9) as its constitutive equation is called a Newtonian fluid.
Substituting Eq. (7.1.9) into Eqs. (7.1.5) to (7.1.7), the Navier–Stokes equations
are obtained as follows:
$$\rho\frac{Du}{Dt} = -\frac{\partial p}{\partial x} + \frac{\partial}{\partial x}\left\{\lambda\left(\frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}\right)\right\} + \frac{\partial}{\partial x}\left(\mu\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial v}{\partial x}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial w}{\partial x}\right) + \frac{\partial}{\partial x}\left(\mu\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial u}{\partial z}\right) + \rho F_x \qquad (7.1.10)$$

$$\rho\frac{Dv}{Dt} = -\frac{\partial p}{\partial y} + \frac{\partial}{\partial y}\left\{\lambda\left(\frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}\right)\right\} + \frac{\partial}{\partial x}\left(\mu\frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial v}{\partial y}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial w}{\partial y}\right) + \frac{\partial}{\partial x}\left(\mu\frac{\partial v}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial v}{\partial y}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial v}{\partial z}\right) + \rho F_y \qquad (7.1.11)$$

$$\rho\frac{Dw}{Dt} = -\frac{\partial p}{\partial z} + \frac{\partial}{\partial z}\left\{\lambda\left(\frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}\right)\right\} + \frac{\partial}{\partial x}\left(\mu\frac{\partial u}{\partial z}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial v}{\partial z}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial w}{\partial z}\right) + \frac{\partial}{\partial x}\left(\mu\frac{\partial w}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial w}{\partial y}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial w}{\partial z}\right) + \rho F_z \qquad (7.1.12)$$

If the fluid is incompressible, the mass density ρ is constant, so Eq. (7.1.3) is written as

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0 \qquad (7.1.13)$$

Then, the Navier–Stokes equations for an incompressible fluid are written as follows:

$$\begin{aligned} \rho\frac{Du}{Dt} &= -\frac{\partial p}{\partial x} + \mu\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) + \rho F_x \\ \rho\frac{Dv}{Dt} &= -\frac{\partial p}{\partial y} + \mu\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \frac{\partial^2 v}{\partial z^2}\right) + \rho F_y \\ \rho\frac{Dw}{Dt} &= -\frac{\partial p}{\partial z} + \mu\left(\frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2} + \frac{\partial^2 w}{\partial z^2}\right) + \rho F_z \end{aligned} \qquad (7.1.14)$$

Finally, let’s discuss the law of conservation of energy. The energy E of a fluid
per unit mass is given as the sum of kinetic energy, internal energy e, and potential
energy Ω as follows:

$$E = \frac{1}{2}\left(u^2 + v^2 + w^2\right) + e + \Omega \qquad (7.1.15)$$

The energy in a small parallelepiped shown in Fig. 7.1 is E × ρ dx dy dz, and its rate of change with time is given by

$$\frac{\partial(\rho E)}{\partial t} dx\,dy\,dz = \dot{Q}\,dx\,dy\,dz + \dot{W} - \dot{E} - \dot{q} \qquad (7.1.16)$$

where Q is the amount of heat generated inside or directly flowing in from outside the fluid, q the amount of heat flowing out to the fluid around the small parallelepiped, E the energy flowing out to the surroundings due to convection, W the work done by the surroundings due to pressure or viscous forces, and the overdot (˙) means the variation per unit time.
Now, let's look at each term of Eq. (7.1.16) in detail. First, Q̇ should be dealt with individually after the specific heat source is determined, and for now, we assume

$$\dot{Q} = \frac{\partial Q}{\partial t} \qquad (7.1.17)$$

Next, Ẇ is calculated from the work done by the stress on each surface of the infinitesimal parallelepiped. Let us calculate the work done by the stress in the x-direction (see Fig. 7.4). On the AEHD surface, the stress in the x-direction is σxx, while the fluid velocity in this direction is u, meaning that the work done to the parallelepiped is −uσxx dy dz. The negative sign is due to the fact that the direction of the stress and the direction of the flow (displacement) are opposite. Since the work on the opposite BCGF surface is (uσxx + ∂(uσxx)/∂x dx) dy dz, the sum of the works on these two surfaces is calculated as follows:

$$-u\sigma_{xx}\,dy\,dz + \left( u\sigma_{xx} + \frac{\partial(u\sigma_{xx})}{\partial x} dx \right) dy\,dz = \frac{\partial(u\sigma_{xx})}{\partial x} dx\,dy\,dz \qquad (7.1.18)$$

The works in the x-direction on the surfaces ABFE and CDHG, and those on the surfaces ADCB and EFGH, are also calculated in the same manner; thus, all the works in the x-direction are obtained.
Since the works in the y- and z-directions are calculated in the same way, we have

$$\dot{W} = \left\{ \frac{\partial(u\sigma_{xx})}{\partial x} + \frac{\partial(u\sigma_{xy})}{\partial y} + \frac{\partial(u\sigma_{xz})}{\partial z} \right\} dx\,dy\,dz + \left\{ \frac{\partial(v\sigma_{yx})}{\partial x} + \frac{\partial(v\sigma_{yy})}{\partial y} + \frac{\partial(v\sigma_{yz})}{\partial z} \right\} dx\,dy\,dz + \left\{ \frac{\partial(w\sigma_{zx})}{\partial x} + \frac{\partial(w\sigma_{zy})}{\partial y} + \frac{\partial(w\sigma_{zz})}{\partial z} \right\} dx\,dy\,dz \qquad (7.1.19)$$

Fig. 7.4 Balance of work due to stresses

Third, as for Ė, the energy inflow balance from each surface is calculated using Fig. 7.5, which shows the energy inflow and outflow per unit area for each surface. Since E is the energy per unit mass, the energy inflow from the surface AEGD is E × ρu dy dz, etc. Thus, the total balance of energy flow for the infinitesimal parallelepiped is obtained as follows:

$$\dot{E} = \left\{ \frac{\partial(\rho u E)}{\partial x} + \frac{\partial(\rho v E)}{\partial y} + \frac{\partial(\rho w E)}{\partial z} \right\} dx\,dy\,dz \qquad (7.1.20)$$

Fig. 7.5 Balance of energy with respect to convection



Fig. 7.6 Balance of heat flux

Finally, as for the term due to heat conduction q̇, the amount of heat flowing in and out per unit area of each surface is calculated with the temperature T and the heat conduction coefficient κ according to Fourier's law, as shown in Fig. 7.6. Then, the balance for the whole infinitesimal parallelepiped is calculated as

$$\dot{q} = -\left\{ \frac{\partial}{\partial x}\left(\kappa\frac{\partial T}{\partial x}\right) + \frac{\partial}{\partial y}\left(\kappa\frac{\partial T}{\partial y}\right) + \frac{\partial}{\partial z}\left(\kappa\frac{\partial T}{\partial z}\right) \right\} dx\,dy\,dz \qquad (7.1.21)$$

Substituting Eqs. (7.1.17), (7.1.19), (7.1.20), and (7.1.21) into Eq. (7.1.16) and rearranging them using Euler's equations of motion Eqs. (7.1.5)–(7.1.7), the constitutive equation Eq. (7.1.9), the continuity equation Eq. (7.1.3), and Eq. (7.1.15), the energy equation is obtained as follows:

$$\rho\frac{De}{Dt} = \frac{\partial Q}{\partial t} + \left\{ \frac{\partial}{\partial x}\left(\kappa\frac{\partial T}{\partial x}\right) + \frac{\partial}{\partial y}\left(\kappa\frac{\partial T}{\partial y}\right) + \frac{\partial}{\partial z}\left(\kappa\frac{\partial T}{\partial z}\right) \right\} - p\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right) + \phi \qquad (7.1.22)$$

where ϕ is called the dissipation energy, defined as

$$\phi = 2\mu\left[\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2 + \left(\frac{\partial w}{\partial z}\right)^2 + \frac{1}{2}\left\{\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial z} + \frac{\partial w}{\partial y}\right)^2 + \left(\frac{\partial w}{\partial x} + \frac{\partial u}{\partial z}\right)^2\right\}\right] + \lambda\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right)^2 \qquad (7.1.23)$$

From the above, the basic equations for fluid dynamics can be summarized as follows.
The equation of continuity derived from the law of the conservation of mass:

$$\frac{\partial \rho}{\partial t} + \frac{\partial(\rho u)}{\partial x} + \frac{\partial(\rho v)}{\partial y} + \frac{\partial(\rho w)}{\partial z} = 0 \qquad (7.1.24)$$

The equations of motion (the Navier–Stokes equations) derived from the law of
the conservation of momentum:
$$\rho\frac{Du}{Dt} = -\frac{\partial p}{\partial x} + \frac{\partial}{\partial x}\left\{\lambda\left(\frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}\right)\right\} + \frac{\partial}{\partial x}\left(\mu\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial v}{\partial x}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial w}{\partial x}\right) + \frac{\partial}{\partial x}\left(\mu\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial u}{\partial z}\right) + \rho F_x \qquad (7.1.25)$$

$$\rho\frac{Dv}{Dt} = -\frac{\partial p}{\partial y} + \frac{\partial}{\partial y}\left\{\lambda\left(\frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}\right)\right\} + \frac{\partial}{\partial x}\left(\mu\frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial v}{\partial y}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial w}{\partial y}\right) + \frac{\partial}{\partial x}\left(\mu\frac{\partial v}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial v}{\partial y}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial v}{\partial z}\right) + \rho F_y \qquad (7.1.26)$$

$$\rho\frac{Dw}{Dt} = -\frac{\partial p}{\partial z} + \frac{\partial}{\partial z}\left\{\lambda\left(\frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}\right)\right\} + \frac{\partial}{\partial x}\left(\mu\frac{\partial u}{\partial z}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial v}{\partial z}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial w}{\partial z}\right) + \frac{\partial}{\partial x}\left(\mu\frac{\partial w}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mu\frac{\partial w}{\partial y}\right) + \frac{\partial}{\partial z}\left(\mu\frac{\partial w}{\partial z}\right) + \rho F_z \qquad (7.1.27)$$

The law of the conservation of energy:

$$\rho\frac{De}{Dt} = \frac{\partial Q}{\partial t} + \left\{ \frac{\partial}{\partial x}\left(\kappa\frac{\partial T}{\partial x}\right) + \frac{\partial}{\partial y}\left(\kappa\frac{\partial T}{\partial y}\right) + \frac{\partial}{\partial z}\left(\kappa\frac{\partial T}{\partial z}\right) \right\} - p\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right) + \phi \qquad (7.1.28)$$

In the case of incompressible fluids, where the internal energy e is written as e = c_v T with the specific heat at constant volume being c_v, Eq. (7.1.28) is rewritten as

$$\rho c_v \frac{DT}{Dt} = \frac{\partial Q}{\partial t} + \left\{ \frac{\partial}{\partial x}\left(\kappa\frac{\partial T}{\partial x}\right) + \frac{\partial}{\partial y}\left(\kappa\frac{\partial T}{\partial y}\right) + \frac{\partial}{\partial z}\left(\kappa\frac{\partial T}{\partial z}\right) \right\} + \phi \qquad (7.1.29)$$

Equation (7.1.29) is also called the heat conduction equation. Using ϕ calculated
from the velocity field obtained by solving the continuity equation (Eq. (7.1.24)) and
the Navier–Stokes equations (Eqs. (7.1.25)–(7.1.27)), the temperature field can be
obtained from Eq. (7.1.29).

7.2 Finite Difference Approximation

In this section, difference approximations of derivatives are explained as the basis of


the finite difference method, which is often used as a numerical solution method for
differential equations in fluid dynamics.
It is well known that the derivative of a function φ(x) is defined as

$$\left.\frac{d\phi}{dx}\right|_{x=a} = \lim_{\Delta x \to 0} \frac{\phi(a+\Delta x) - \phi(a)}{\Delta x} \qquad (7.2.1)$$

In numerical analysis, however, the limit calculation above is difficult because of the finite numerical precision of computers. For this reason, a difference approximation is often used as a substitute. For example, the first derivative of a function is substituted by its difference approximation as follows:

$$\left.\frac{d\phi}{dx}\right|_{x=a} \approx \frac{\phi(a+\Delta x) - \phi(a)}{\Delta x} \qquad (7.2.2)$$

Equation (7.2.2) has a drawback that it does not provide any information about the approximation accuracy, while the well-known Taylor expansion can be used as an error-estimable difference approximation for the derivatives [2, 15], which is given as follows:

$$\phi(a+\Delta x) = \phi(a) + \Delta x \left.\frac{d\phi}{dx}\right|_{x=a} + \frac{1}{2}(\Delta x)^2 \left.\frac{d^2\phi}{dx^2}\right|_{x=a} + \frac{1}{6}(\Delta x)^3 \left.\frac{d^3\phi}{dx^3}\right|_{x=a} + \cdots = \sum_{n=0}^{\infty} \frac{1}{n!}(\Delta x)^n \left.\frac{d^n\phi}{dx^n}\right|_{x=a} \qquad (7.2.3)$$

where

$$0! = 1, \quad \frac{d^0\phi}{dx^0} = \phi(x) \qquad (7.2.4)$$

And the following equation holds:

$$\phi(a+\Delta x) = \sum_{n=0}^{M-1} \frac{1}{n!}(\Delta x)^n \left.\frac{d^n\phi}{dx^n}\right|_{x=a} + \frac{1}{M!}(\Delta x)^M \left.\frac{d^M\phi}{dx^M}\right|_{x=a+\theta\Delta x} \quad (0 \le \theta \le 1) \qquad (7.2.5)$$

Assuming a one-dimensional grid as shown in Fig. 7.7, we define the notations as follows:

$$\phi_l \equiv \phi(x_l), \quad \phi_l^{(n)} \equiv \left.\frac{d^n\phi}{dx^n}\right|_{x=x_l}, \quad \phi_{l+\theta}^{(n)} \equiv \left.\frac{d^n\phi}{dx^n}\right|_{x=x_l+\theta\Delta x}, \quad h \equiv \Delta x \qquad (7.2.6)$$

Using these notations, the Taylor expansions (Eqs. (7.2.3) and (7.2.5)) can be compactly written, respectively, as

$$\phi(x_l+\Delta x) = \phi_{l+1} = \phi_l + \phi_l^{(1)} h + \frac{1}{2}\phi_l^{(2)} h^2 + \frac{1}{6}\phi_l^{(3)} h^3 + \frac{1}{24}\phi_l^{(4)} h^4 + \frac{1}{120}\phi_l^{(5)} h^5 + \cdots \qquad (7.2.7)$$

$$\phi_{l+1} = \sum_{n=0}^{M-1} \frac{1}{n!}\phi_l^{(n)} h^n + \frac{1}{M!}\phi_{l+\theta}^{(M)} h^M \quad (0 \le \theta \le 1) \qquad (7.2.8)$$

By rearranging Eq. (7.2.7) with respect to φ_l^{(1)}, a difference approximation for the first-order derivative is obtained as

$$\phi_l^{(1)} = \frac{\phi_{l+1}-\phi_l}{h} - \frac{1}{2}\phi_l^{(2)} h - \frac{1}{6}\phi_l^{(3)} h^2 - \frac{1}{24}\phi_l^{(4)} h^3 - \frac{1}{120}\phi_l^{(5)} h^4 - \cdots = \frac{\phi_{l+1}-\phi_l}{h} + O(h) \qquad (7.2.9)$$

where O(h) is the term for the approximation error, which means that, as the grid
spacing h decreases, the approximation error decreases in proportion to the grid
spacing.

Fig. 7.7 One-dimensional grid



On the other hand, φ_{l−1} = φ(x_l − Δx) can be expressed by the Taylor expansion as

$$\phi_{l-1} = \phi_l + \phi_l^{(1)}(-h) + \frac{1}{2}\phi_l^{(2)}(-h)^2 + \frac{1}{6}\phi_l^{(3)}(-h)^3 + \frac{1}{24}\phi_l^{(4)}(-h)^4 + \cdots = \phi_l - \phi_l^{(1)} h + \frac{1}{2}\phi_l^{(2)} h^2 - \frac{1}{6}\phi_l^{(3)} h^3 + \frac{1}{24}\phi_l^{(4)} h^4 - \frac{1}{120}\phi_l^{(5)} h^5 + \cdots \qquad (7.2.10)$$

By rearranging Eq. (7.2.10) with respect to φ_l^{(1)}, another difference approximation of the first-order derivative is obtained as follows:

$$\phi_l^{(1)} = \frac{\phi_l-\phi_{l-1}}{h} + \frac{1}{2}\phi_l^{(2)} h - \frac{1}{6}\phi_l^{(3)} h^2 + \frac{1}{24}\phi_l^{(4)} h^3 - \frac{1}{120}\phi_l^{(5)} h^4 + \cdots = \frac{\phi_l-\phi_{l-1}}{h} + O(h) \qquad (7.2.11)$$
In addition, another difference approximation of the first-order derivative is derived from Eqs. (7.2.7) and (7.2.10), which are shown again in aligned form as

$$\begin{cases} \phi_{l+1} = \phi_l + \phi_l^{(1)} h + \frac{1}{2}\phi_l^{(2)} h^2 + \frac{1}{6}\phi_l^{(3)} h^3 + \frac{1}{24}\phi_l^{(4)} h^4 + \frac{1}{120}\phi_l^{(5)} h^5 + \cdots \\ \phi_{l-1} = \phi_l - \phi_l^{(1)} h + \frac{1}{2}\phi_l^{(2)} h^2 - \frac{1}{6}\phi_l^{(3)} h^3 + \frac{1}{24}\phi_l^{(4)} h^4 - \frac{1}{120}\phi_l^{(5)} h^5 + \cdots \end{cases} \qquad (7.2.12)$$

Taking the difference of these two equations, we obtain

$$\phi_{l+1} - \phi_{l-1} = 2\phi_l^{(1)} h + \frac{2}{6}\phi_l^{(3)} h^3 + \frac{2}{120}\phi_l^{(5)} h^5 + \cdots \qquad (7.2.13)$$

Rearranging Eq. (7.2.13) with respect to φ_l^{(1)}, another difference approximation of the first-order derivative is obtained as

$$\phi_l^{(1)} = \frac{\phi_{l+1}-\phi_{l-1}}{2h} - \frac{1}{6}\phi_l^{(3)} h^2 - \frac{1}{120}\phi_l^{(5)} h^4 - \cdots = \frac{\phi_{l+1}-\phi_{l-1}}{2h} + O\left(h^2\right) \qquad (7.2.14)$$

The approximation error in Eq. (7.2.14) is O(h²), which means that Eq. (7.2.14) is a more accurate approximation than Eqs. (7.2.9) and (7.2.11).
Equation (7.2.9) is called the forward difference approximation, Eq. (7.2.11) the
backward difference approximation, and Eq. (7.2.14) the central difference approx-
imation. Figure 7.8 shows a schematic diagram of these difference approximations.
A difference approximation for the second-order derivative can be derived by
summing two equations in Eq. (7.2.12) as follows:

2 (4) 4
φl+1 + φl−1 = 2φl + φl(2) h 2 + φ h + ··· (7.2.15)
24 l
7.2 Finite Difference Approximation 213

Forward Difference
Backward Difference
Central Difference

Fig. 7.8 Approximation of first derivative

By rearranging Eq. (7.2.15) with respect to φl(2) , a difference approximation of


the second-order derivative is obtained as
φl+1 − 2φl + φl−1 2
φl(2) = − φl(4) h 2 + · · ·
h2 24
φl+1 − 2φl + φl−1 ( )
= 2
+ O h2 (7.2.16)
h
( )
where its approximation error behaves as O h 2 .
By using additional function values, it is possible to give more accurate difference
approximations.
For example, using function values at xl−2 and xl+2 in addition to those at xl−1 and
xl+1 , highly accurate finite difference approximations of the derivatives are obtained
as follows.
First, by replacing Δx with 2Δx in the Taylor expansion (Eq. (7.2.3)), we have

1 1 1
φl+2 = φl + φl(1) (2h) + φl(2) (2h)2 + φl(3) (2h)3 + φl(4) (2h)4 + · · ·
2 6 24
1 (2) 2 1 (3) 3 1
= φl + 2 · φl h + 4 · φl h + 8 · φl h + 16 · φl(4) h 4 + · · · (7.2.17)
(1)
2 6 24
Similarly, by replacing Δx with −2Δx in the Taylor expansion Eq. (7.2.3), we
have
1 1 1
φl−2 = φl + φl(1) (−2h) + φl(2) (−2h)2 + φl(3) (−2h)3 + φl(4) (−2h)4 + · · ·
2 6 24
1 1 1
= φl − 2 · φl(1) h + 4 · φl(2) h 2 − 8 · φl(3) h 3 + 16 · φl(4) h 4 − · · · (7.2.18)
2 6 24

Multiplying Eqs. (7.2.17), (7.2.7), (7.2.10), and (7.2.18) by a, b, c, and d, respectively, and then summing them up, we achieve

$$\begin{aligned} a\phi_{l+2} + b\phi_{l+1} + c\phi_{l-1} + d\phi_{l-2} &= (a+b+c+d)\phi_l + (2a+b-c-2d)\phi_l^{(1)} h \\ &\quad + (4a+b+c+4d)\frac{1}{2}\phi_l^{(2)} h^2 + (8a+b-c-8d)\frac{1}{6}\phi_l^{(3)} h^3 \\ &\quad + (16a+b+c+16d)\frac{1}{24}\phi_l^{(4)} h^4 + (32a+b-c-32d)\frac{1}{120}\phi_l^{(5)} h^5 + \cdots \end{aligned} \qquad (7.2.19)$$

By setting the coefficient of φ_l^{(1)} to 1 and the coefficients of φ_l^{(2)}, φ_l^{(3)}, and φ_l^{(4)} to 0 in Eq. (7.2.19), a finite difference approximation for φ_l^{(1)} is obtained by solving the following simultaneous linear equations:

$$\begin{cases} 2a+b-c-2d = 1 \\ 4a+b+c+4d = 0 \\ 8a+b-c-8d = 0 \\ 16a+b+c+16d = 0 \end{cases} \qquad (7.2.20)$$

Solving Eq. (7.2.20), we have

$$a = -\frac{1}{12}, \quad b = \frac{2}{3}, \quad c = -\frac{2}{3}, \quad d = \frac{1}{12} \qquad (7.2.21)$$
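These coefficients can be reproduced numerically by solving the linear system (7.2.20) directly; the short NumPy sketch below is one way to do so:

```python
import numpy as np

# rows correspond to the four conditions of Eq. (7.2.20), unknowns are (a, b, c, d)
A = np.array([[ 2.0, 1.0, -1.0, -2.0],
              [ 4.0, 1.0,  1.0,  4.0],
              [ 8.0, 1.0, -1.0, -8.0],
              [16.0, 1.0,  1.0, 16.0]])

print(np.linalg.solve(A, [1.0, 0.0, 0.0, 0.0]))   # -> [-1/12, 2/3, -2/3, 1/12], Eq. (7.2.21)
```

The same matrix with a different right-hand side yields the coefficient sets for the other stencils derived in the remainder of this section.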
Substituting these values into Eq. (7.2.19), another finite difference approximation of the first-order derivative is obtained as

$$\phi_l^{(1)} = \frac{-\phi_{l+2}+8\phi_{l+1}-8\phi_{l-1}+\phi_{l-2}}{12h} + 4\cdot\frac{1}{120}\phi_l^{(5)} h^4 + \cdots = \frac{-\phi_{l+2}+8\phi_{l+1}-8\phi_{l-1}+\phi_{l-2}}{12h} + O\left(h^4\right) \qquad (7.2.22)$$

This approximation formula has fourth-order accuracy, more accurate than the central difference approximation of second-order accuracy, Eq. (7.2.14).
In the same way, another difference approximation of φl(2) can be obtained by
setting the coefficient of φl(2) to 2 and the coefficients of φl(1) , φl(3) and φl(4) to 0 in
Eq. (7.2.19). Thus, the following simultaneous linear equations are to be solved.


$$\begin{cases} 2a+b-c-2d = 0 \\ 4a+b+c+4d = 2 \\ 8a+b-c-8d = 0 \\ 16a+b+c+16d = 0 \end{cases} \qquad (7.2.23)$$

Solving Eq. (7.2.23), we achieve

$$a = -\frac{1}{12}, \quad b = \frac{4}{3}, \quad c = \frac{4}{3}, \quad d = -\frac{1}{12} \qquad (7.2.24)$$
Substituting these values into Eq. (7.2.19), another finite difference approximation of the second-order derivative is obtained as follows:

$$\phi_l^{(2)} = \frac{-\phi_{l+2}+16\phi_{l+1}-30\phi_l+16\phi_{l-1}-\phi_{l-2}}{12h^2} + 8\cdot\frac{1}{720}\phi_l^{(6)} h^4 + \cdots = \frac{-\phi_{l+2}+16\phi_{l+1}-30\phi_l+16\phi_{l-1}-\phi_{l-2}}{12h^2} + O\left(h^4\right) \qquad (7.2.25)$$

This approximation formula has fourth-order accuracy, which is more accurate than the second-order accurate approximation of the second-order derivative, Eq. (7.2.16). Thus, by expanding the range of sampled function values used in the finite difference approximation, it is possible to create formulae for derivatives of various orders with different accuracies.
In addition, it is possible to set the range of function values to be sampled asymmetrically with respect to the evaluation point of the derivative. For example, let's set a = 0 in Eq. (7.2.19), which results in using the four asymmetric points in the neighborhood of the evaluation point, φ_{l+1}, φ_l, φ_{l−1}, and φ_{l−2}. Thus, the following simultaneous linear equations are to be solved to set the coefficient of φ_l^{(1)} to 1 and the coefficients of φ_l^{(2)} and φ_l^{(3)} to 0 in Eq. (7.2.19):

$$\begin{cases} b-c-2d = 1 \\ b+c+4d = 0 \\ b-c-8d = 0 \end{cases} \qquad (7.2.26)$$

Solving Eq. (7.2.26), we have

$$a = 0, \quad b = \frac{1}{3}, \quad c = -1, \quad d = \frac{1}{6} \qquad (7.2.27)$$
Substituting these values into Eq. (7.2.19), another finite difference approximation of the first-order derivative is obtained as follows:

$$\phi_l^{(1)} = \frac{2\phi_{l+1}+3\phi_l-6\phi_{l-1}+\phi_{l-2}}{6h} - 2\cdot\frac{1}{24}\phi_l^{(4)} h^3 + \cdots = \frac{2\phi_{l+1}+3\phi_l-6\phi_{l-1}+\phi_{l-2}}{6h} + O\left(h^3\right) \qquad (7.2.28)$$

This is a third-order finite difference approximation for the first-order derivative


with asymmetric sampling points.
The finite difference approximations of derivatives obtained above are summa-
rized as follows:
The forward difference approximation for the first-order derivative with accuracy
O (h):

φl+1 − φl
φl(1) = + O(h) (7.2.29)
h
The backward difference approximation for the first-order derivative with
accuracy O (h):

φl − φl−1
φl(1) = + O(h) (7.2.30)
h
The central difference approximation for the first-order derivative with accuracy
O (h2 ):

φl+1 − φl−1 ( )
φl(1) = + O h2 (7.2.31)
2h

The difference approximation for the first-order derivative with accuracy O (h3 ):

2φl+1 + 3φl − 6φl−1 + φl−2 ( )


φl(1) = + O h3 (7.2.32)
6h

The difference approximation for the first-order derivative with accuracy O (h4 ):

−φl+2 + 8φl+1 − 8φl−1 + φl−2 ( )


φl(1) = + O h4 (7.2.33)
12h
The difference approximation for the second-order derivative with accuracy O
(h2 ):

φl+1 − 2φl + φl−1 ( )


φl(2) = 2
+ O h2 (7.2.34)
h
The difference approximation for the second-order derivative with accuracy O
(h4 ):

−φl+2 + 16φl+1 − 30φl + 16φl−1 − φl−2 ( )


φl(2) = 2
+ O h4 (7.2.35)
12h
Then, consider finite difference approximation of the partial derivatives in two-
or three-dimensional space.
7.2 Finite Difference Approximation 217

A two-dimensional grid is shown in Fig. 7.9. The partial derivatives in each axial
direction can be calculated in the same way as in the one-dimensional case. For
example, the difference approximations for the first-order partial derivatives by the
central difference are obtained as follows:
|
∂φ || φi+1, j − φi−1, j ( )
| = + O Δx 2 (7.2.36)
∂ x i, j 2Δx
|
∂φ || φi, j+1 − φi, j−1 ( )
| = + O Δy 2 (7.2.37)
∂ y i, j 2Δy

Similarly, Fig. 7.10 shows a three-dimensional grid. The partial derivatives in each
axial direction can be calculated in the same way as in the one-dimensional case. For
example, the difference approximations for the first-order partial derivatives by the
central difference are obtained as follows:
|
∂φ || φi+1, j,k − φi−1, j,k ( )
| = + O Δx 2 (7.2.38)
∂ x i, j,k 2Δx
|
∂φ || φi, j+1,k − φi, j−1,k ( )
| = + O Δy 2 (7.2.39)
∂ y i, j,k 2Δy

Fig. 7.9 Two-dimensional grid for finite difference method


218 7 Flow Simulation with Deep Learning

Fig. 7.10 Three-dimensional grid for finite difference method

|
∂φ || φi, j,k+1 − φi, j,k−1 ( )
| = + O Δz 2 (7.2.40)
∂ x i, j,k 2Δz

It is noted that one-dimensional difference approximations of derivatives in space


can be easily converted to those in time by substituting x and h (or Δx) with t and
Δt, respectively.

7.3 Flow Simulation of Incompressible Fluid with Finite


Difference Method

In this section, the basic equations of fluid dynamics derived in Sect. 7.1 are
discretized using the finite difference approximation studied in Sect. 7.2, and a
method for obtaining the solution with a test result are presented.

7.3.1 Non-dimensional Navier–Stokes Equations

The equation of continuity and the Navier–Stokes equations for an incompressible


fluid in the three-dimensional space are expressed as follows (see Sect. 7.1):
7.3 Flow Simulation of Incompressible Fluid … 219

∂u ∂v ∂w
+ + =0 (7.3.1)
∂x ∂y ∂z
( 2 )
∂p ∂ u ∂2u ∂2u
ρ Du = − ∂x
+ μ + + + ρ Fx
( ∂ x2 ∂y ∂z )
Dt 2 2 2

∂p ∂ v ∂2v ∂2v
ρ Dt = − ∂ y + μ ∂ x 2 + ∂ y 2 + ∂z 2 + ρ Fy
Dv
(7.3.2)
( 2 )
∂p ∂ w ∂2w ∂2w
ρ Dw
Dt
= − ∂z
+ μ ∂x 2 + ∂y 2 + ∂z 2 + ρ Fz

Using the representative length L (m) and the representative speed U (m/s), we
define the non-dimensional quantities as follows:
x y z
x̃ = , ỹ = , z̃ = (7.3.3)
L L L
u v w
ũ = , ṽ = , w̃ = (7.3.4)
U U U
U
t˜ = t (7.3.5)
L
Using these non-dimensional values, Eqs. (7.3.1) and (7.3.2) are converted to the
non-dimensional equations as

∂ ũ ∂ ṽ ∂ w̃
+ + =0 (7.3.6)
∂ x̃ ∂ ỹ ∂ z̃
( )
∂ p̃ 1 ∂ 2 ũ ∂ 2 ũ ∂ 2 ũ
D ũ
D t˜
= − ∂ x̃
+ Re ( ∂ x̃ 2 + ∂ ỹ 2 + ∂ z̃ )
2 + F̃x
∂ 1 ∂ ṽ 2
∂ ṽ
2
∂ ṽ
2
D ṽ
D t˜

= − ∂ ỹ + Re 2 + ∂ ỹ 2 + ∂ z̃ 2 + F̃y (7.3.7)
( ∂2x̃ )
D w̃ ∂ p̃ 1 ∂ w̃ ∂ w̃
2
∂ 2 w̃
D t˜
= − ∂ z̃ + Re ∂ x̃ 2 + ∂ ỹ 2 + ∂ z̃ 2 + F̃z

where p̃, F̃i , and Re are also the non-dimensional values defined, respectively, as
p
p̃ = (7.3.8)
ρU 2
L
F̃i = Fi (i = x, y, z) (7.3.9)
U2
ρU L
Re = (7.3.10)
μ

Note that Re is called as the Reynolds number, which is an important indicator


for properties of fluid.
220 7 Flow Simulation with Deep Learning

7.3.2 Solution Method

The flow field of an incompressible fluid can be obtained by solving Eqs. (7.3.6)
and (7.3.7). For simplicity, the external force term is assumed to be absent in the
following discussion.
Let ṽn and p̃ n be the velocity and pressure at the nth time step, respectively. Then,
by discretizing Eq. (7.3.7) with respect to time, the explicit equation that represents
ṽn+1 is obtained as
( )
ṽn+1 = L 1 ṽn , p̃ n+1 (7.3.11)

Taking the divergence of Eq. (7.3.11) and employing Eq. (7.3.6), Poisson’s
equation for pressure p̃ is obtained as follows:
( )
∇ · ∇ p̃ n+1 = L 2 ṽn (7.3.12)

where L 1 and L 2 are the differential operators.


The right-hand side of Eq. (7.3.12) is determined only by the velocity at the nth
time step (known quantity), while the right-hand side of Eq. (7.3.11) includes the
pressure at the (n + 1)-th time step (unknown quantity). Therefore, the solution
procedure for Eqs. (7.3.11) and (7.3.12) is given as follows:
Step 1: Solve Eq. (7.3.12) to obtain p̃ n+1 .
Step 2: Using p̃ n+1 obtained in Step 1, calculate ṽn+1 using Eq. (7.3.11).
Step 3: n → n + 1 and return to Step 1.
The assumption of incompressibility leads to Poisson’s equation about pressure
Eq. (7.3.12), whereas it is known that the incompressibility makes the system stiff.
In order to alleviate this, there is a method of introducing virtual compressibility into
the continuity equation (Eq. (7.3.6)) as follows:

1 ∂ p̃ ∂ ũ ∂ ṽ ∂ w̃
+ + + =0 (7.3.13)
β ∂τ ∂ x̃ ∂ ỹ ∂ z̃

where β is the pseudo-compression factor and τ the pseudo-time. This method is


called the pseudo-compressibility method or the artificial compressibility method
[8].
7.3 Flow Simulation of Incompressible Fluid … 221

7.3.3 Example: 2D Flow Simulation of Incompressible Fluid


Around a Circular Cylinder

Based on the discussion in the previous sections, this section presents an example of
analysis of a two-dimensional flow field using the finite difference method, which
will be used as the training data for deep learning in Sect. 7.6 [10].
The basic equations in the two-dimensional space are given as follows:

∂ ũ ∂ ṽ
+ =0 (7.3.14)
∂ x̃ ∂ ỹ
( )
D ũ
D t˜
= − ∂∂ x̃p̃ + Re
1 ∂ 2 ũ
2 +
∂ 2 ũ
( ∂ x̃ ∂ ỹ 2 )
(7.3.15)
∂ p̃ 1 ∂ 2 ṽ ∂ 2 ṽ
D ṽ
D t˜
= − ∂ ỹ + Re ∂ x̃ 2 + ∂ ỹ 2

The analysis domain is shown in Fig. 7.11, and the specifications for the analysis
are summarized in Table 7.1.
As for the boundary conditions, the inlet boundary has a uniform flow in the x-
direction, and the pressure and velocity at the outlet boundary are extrapolated from
nearby values. For the side boundaries, a constant pressure condition is imposed,
and the velocity is extrapolated from nearby values. These boundary conditions are

Circular Cylinder
Flow

Fig. 7.11 Analysis domain

Table 7.1 Specifications of


Grid size 1250 × 800
analysis parameters
Diameter of cylinder 40 grids
Raynords number 10,000
Total time steps 30,000
Time step width 1/160
222 7 Flow Simulation with Deep Learning

not necessarily accurate, but they are proved to be accurate enough for data for deep
learning.
The finite difference method is employed as for the numerical solution
method together with the pseudo-compressibility method. For the discretization in
spatial domain, the Monotonic Upstream-centered Scheme for Conservation Laws
(MUSCL) approximation [8], a kind of upwind difference method, is used to achieve
third-order accuracy. Note that it took about 1.5 s per time step to perform the
simulation. (CPU: Intel Core-i7 2.5 GHz).
We will show an example of visualization of calculation results of velocity in what
follows. With the vorticity ω being the rotation of the velocity field, the vorticity of
velocity v = (u, v, w)T in the three-dimensional space is defined as follows:
⎛ ∂v

∂z
− ∂w
∂y
⎜ ∂w ∂x ⎟
ω =∇×v=⎝ ∂x
− ∂z ⎠ (7.3.16)
∂u ∂v
∂y
− ∂x

In the two-dimensional velocity field, the vorticity is expressed as a scalar quantity


in the form:
⎛ ⎞
0
⎜ ⎟
ω=⎝ 0 ⎠ (7.3.17)
∂u ∂v
∂y
− ∂x

Fig. 7.12 shows the time variation of vorticity around and behind the circular
cylinder, where the four images in the figure are, respectively, vorticity images at
time steps 500, 1000, 1500, and 2000 from top to bottom. Note that these images
only show those near and behind the cylinder, not at the entire analysis domain.
According to the visualization of the vorticity in the entire analysis domain, it is
confirmed that the flow generates the Kalman vortex train, and the vortex shedding
is repeated at regular intervals. It is also confirmed that twin vortices are formed by
the 500th time step, the twin vortices lose their symmetry near the 1000th time step,
the vortex detachment occurs near the 1500th time step, and the vortex trains are
released at the 3000th time step.

7.4 Flow Simulation with Deep Learning

In the previous sections, the basic equations and numerical solutions in fluid dynamics
are reviewed with an example of analysis of two-dimensional unsteady flow. As the
analysis of unsteady flow is one of the time-dependent problems, it is known to be
computationally demanding with a large number of time steps. In recent years, the
scale of computation has become larger and such complex phenomena as coupled
7.4 Flow Simulation with Deep Learning 223

Fig. 7.12 Vorticities around


cylinder at t = 500, 1000,
1500 and 2000 from top to
bottom

problems and multiphysics have been often performed, accelerating the increase of
computation time.
In numerical fluid dynamics analysis, the solution for the next calculation step is
calculated using the results (solution) of the past time steps. If the solution of the next
time step or ahead can be calculated or predicted without using numerical analysis,
it may lead to a significant reduction of computational load.
Here, a method employing deep learning to reduce the computational load in the
fluid analysis is discussed [10], where the prediction method consists of the following
three phases,
224 7 Flow Simulation with Deep Learning

Data Preparation Phase: Perform a large number of fluid analyses or


a few long time fluid analyses, to find the analysis results of a certain
time step (tnow ), (Results )
CFD
(tnow ), those( at several
) past time steps (K time
steps), Results CFD
tpastK , . . . , ResultsCFD tpast1 , and those at some time step
(tfuture ) ahead (future)
{( of tnowCFD , Results
( ) (tfuture ). Thus,
CFD
( a large
) number of data )
pairs consisting of Results tpastK , . . . , ResultsCFD tpast1 , ResultsCFD (tnow ) ,
}
ResultsCFD (tfuture ) are collected.
Training Phase: The data pairs collected in the Data Preparation Phase above are
used to train a predictor (neural network) of the analysis results using deep learning.
The input and teacher data are set, respectively, as follows:
( ) ( )
Input data: ResultsCFD tpastK , . . . , ResultsCFD tpast1 , ResultsCFD (tnow )
Teacher data: ResultsCFD (tfuture )
The trained predictor (neural( network) ) will output( the ) predicted data
ResultsDL (tfuture ) when ResultsCFD tpastK , . . . , ResultsCFD tpast1 , ResultsCFD (tnow )
are input. ( )
Application Phase: When new input data ResultsCFD tpastK , . . . ,
( )
ResultsCFD tpast1 , ResultsCFD (tnow ), which are not included in the training data
in the Training Phase, are input to the predictor trained above, the predicted data,
ResultsDL (tfuture ) for the input data, are output.
Note that the analysis results ResultsCFD collected in the Data Preparation
Phase could be images visualizing the analysis results, physical quantities such
as velocity and pressure obtained by simulation or a mixture of them. It is also
important to( note) that the number (of past ) analysis results to be used as input data,
ResultsCFD tpastK , . . . , ResultsCFD tpast1 , should be chosen appropriately according
to the flow field to be predicted, and they are not required to be those at consecutive
time steps since, in numerical fluid dynamics analysis, the time interval is usually
kept small enough for stable analysis, and the results of analysis for consecutive time
steps are often considered to have a very high degree of similarity.
When employing the trained predictor in the Application Phase, the results at the
current and past time steps input to the predictor should not necessarily be those by
computational fluid analysis, but we can employ those predicted by the predictor,
meaning that various set of data can be used for input as follows:
( ) ( ) ( )
ResultsCFD tpastK , . . . , ResultsCFD tpast2 , ResultsCFD tpast1 , ResultsDL (tnow ),
( ) ( ) ( )
ResultsCFD tpastK , . . . , ResultsDL tpast2 , ResultsDL tpast1 , ResultsDL (tnow ),

and
( ) ( ) ( )
ResultsDL tpastK , . . . , ResultsDL tpast2 , ResultsDL tpast1 , ResultsDL (tnow ).

In the case of implicit analysis, the prediction result can be used as the initial
value of the iterative solution method.
7.5 Neural Networks for Time-Dependent Data 225

7.5 Neural Networks for Time-Dependent Data

Solutions of such dynamic problems as unsteady fluid analysis can be obtained as


time-series data. Video and audio data are also time-series ones.
In the neural networks taken in previous chapters, we assumed that the training
patterns are independent each other and not interrelated or sequential. However, when
predicting the solution of an unsteady fluid analysis in the next time step at a certain
point in time, it may not be appropriate to ignore the history up to that point. Then,
the neural network for prediction on the series data should have a mechanism to take
into account the history in order to utilize the characteristics of the series data.
In this section, recurrent neural networks [7] are summarized, which have been
developed as neural networks suitable for serial data, and then the long short-term
memory (LSTM) neural network [5, 9], which is regarded as an advanced type of
recurrent neural networks.

7.5.1 Recurrent Neural Network

The behavior of a standard unit in a feedforward neural network is given as


( nl−1 )
(p ) Σ
p
O lj = f U lj = f wl−1
ji ·
p
Oil−1 + θ lj (7.5.1)
i=1

where
p
O lj Output value of the activation function of the j-th unit in the l-th layer for the
p-th pattern,
p l
U j Input value to the activation function of the j-th unit in the l-th layer for the
p-th pattern.

On the other hand, the behavior of a unit in a recurrent neural network [7] is shown
as
⎛ ⎞
(p ) Σ
nl−1
Σ
nl
= f⎝ + θ lj ⎠
p l p−1 l
ROj = f RU j
l
wl−1
ji ·
p
Oil−1 + W lj j ' · R O j' (7.5.2)
i=1 j ' =1

where W lj j ' is the newly added connection weight between the j-th and j ' -th units
p p
of the l-th layer, and R at the bottom left of R O lj and R U lj indicates that they are
quantities related to units of recurrent type.
Let’s consider the function of the newly added second term on the right-hand side
p−1
of Eq. (7.5.2). Note that the superscript on the left shoulder of R O lj ' is p − 1, which
indicates R O lj ' is the output of the j ' unit of the l-th layer for the previous training
p−1
226 7 Flow Simulation with Deep Learning

pattern. In other words, the second term on the right-hand side indicates that the
output of all the units in the l-th layer for the previous training pattern is taken into
account when performing the calculation for the current training pattern. Although
p−1
the term has only the value for the (p − 1)-th training pattern as R O lj ' , it should
p l p l
be noted that R U j and R O j are affected by all the previous training patterns because
p−1 l p−2 l p−2 l p−3 l
R O j ' is affected by R O j and R O j by R O j and so on in a recursive manner
as described in Eq. (7.5.2).
Then, what will be the update rule of the connection weight in the recurrent neural
network? Let’s compare the derivative values between the output of the normal unit
p l p
O j and that of the recurrent unit R O lj . From Eq. (7.5.1), the derivative of the output
O j of the standard unit with respect to wαβ
p l l−1
is given by
( ) ( nl−1 )
∂ O lj
p ∂f p
U lj ∂ p U lj ( ) Σ ∂wl−1
' p ji
= = f l
Uj · p Oil−1 (7.5.3)
∂wαβ
l−1
∂ p U lj ∂wαβ
l−1
i=1
∂wαβ
l−1

p
On the other hand, that of the recurrent unit output R O lj with respect to wαβ
l−1
can
be obtained from Eq. (7.5.2) as
( )
p l
p
∂ R O lj ∂f RU j
p
∂ R U lj
= p
∂wαβl−1
∂ R U lj
∂wαβ l−1
⎛ ⎞
( ) Σ
nl−1
∂w l−1 Σnl

p−1 l
O '
= f ' R U lj ⎝ ⎠
p ji R j
Oi +
p l−1
W lj j ' (7.5.4)
i=1
∂w l−1
αβ '
j =1
∂w l−1
αβ

The last term in Eq. (7.5.4) is the derivative of the output of the recurrent unit
in the previous pattern. Thus, the recurrent neural networks cannot use the standard
error backpropagation algorithm for ordinary feedforward neural networks.
As learning methods for recurrent neural networks, backpropagation through time
(BPTT) [13] and real-time recurrent learning (RTRL) [14] have been developed. The
former is known to be simpler and computationally faster.
Consider a three-layer neural network as shown in Fig. 7.13 (left). Only the
middle layer is a recurrent layer. A simplified diagram of the network is shown
in Fig. 7.13 (right), and the computation for N training patterns is schematically
shown in Fig. 7.14. In practice, training cannot be done at the same time, and calcu-
lations are done in order from the first learning pattern to the N-th learning pattern.
This is because the output of the hidden layer for the previous training pattern is used
for the calculation of the current training pattern. Taking this into account, Fig. 7.14
can be rewritten as Fig. 7.15, which can be regarded as one large network, and the
BPTT method is so designed as applied to this combined network.
7.5 Neural Networks for Time-Dependent Data 227

Fig. 7.13 Three-layer recurrent neural network

Fig. 7.14 Three-layer recurrent neural network for consecutive patterns

Fig. 7.15 Another schematic view of three-layer recurrent neural network for consecutive patterns
228 7 Flow Simulation with Deep Learning

Assume that the squared error is given by

1 ΣΣ
n
3
( p 3 p )2
N
E= Ok − Tk (7.5.5)
2 p=1 k=1

First, let’s calculate the update of the connection weight between the hidden layer
and the output layer, which is given as

1 ΣΣ
n
∂ ( p 3 p )2
N
∂E 3

= Ok − Tk
∂wab
2 2 p=1 k=1 ∂wab
2

Σ
N Σ
n3
( ) ∂ p Ok3
= p
Ok3 − p Tk
p=1 k=1
∂wab
2

Σ
N
( ) ∂ p Oa3
= p
Oa3 − p Tk (7.5.6)
p=1
∂wab
2

where
( ) ( )
∂ p Oa3 ∂ f p Ua3 ∂ f p Ua3 ∂ p Ua3
= =
∂wab2
∂wab2 ∂ p Ua3 ∂wab 2

( ) ⎛ ⎞
∂ f p Ua3 ∂ Σ n2
= ⎝ wa2 j · R O 2j + θa3 ⎠
p
∂ p Ua3 ∂wab 2
j=1
( p 3)
∂ f Ua p 2
= · R Ob (7.5.7)
∂ p Ua3

Substituting Eq. (7.5.7) into Eq. (7.5.6), we have


( )
∂E ΣN
( p 3 p ) ∂ f p Ua3 p 2
= Oa − Tk · · R Ob (7.5.8)
∂wab
2
p=1
∂ p Ua3

From Eq. (7.5.8), it can be seen that the update of the connection weight between
∂E
the hidden layer and the output layer, ∂w 2 , is calculated using the value of each term
ab
on the right-hand side for each training pattern.
Next, let us calculate the update of the connection weight between input and
hidden layers,

1 ΣΣ
n
∂ ( p 3 p )2
N
∂E 3

= Ok − Tk
∂wcd
1 2 p=1 k=1 ∂wcd
1
7.5 Neural Networks for Time-Dependent Data 229

Σ
N Σ
n3
( ) ∂ p Ok3
= p
Ok3 − p Tk (7.5.9)
p=1 k=1
∂wcd
1

where
( ) ( )
∂ p Ok3 ∂ f p Uk3 ∂ f p Uk3 ∂ p Uk3
= =
∂wcd1
∂wcd1
∂ p Uk3 ∂wcd 1

( ) ⎛ ⎞
∂ f p Uk3 ∂ Σ n2
= ⎝ wk2j · R O 2j + θk3 ⎠
p
∂ p Uk3 ∂wcd1
j=1
( p 3 ) ⎛ n2 ⎞
∂ f Ua Σ ∂
p 2
O
·⎝ ⎠
R j
= wk2j · (7.5.10)
∂ p Ua3 j=1
∂w 1
cd

and
( ) ( )
p 2
p
∂ R O 2j ∂f RU j ∂f p
U 2j ∂ p U 2
j
= =
∂wcd
1
∂wcd 1
∂ p U 2j ∂wcd 1
( ) ⎛ ⎞
∂ f p U 2j ∂ ⎝Σ
n1 Σ n2
W j2j ' · R O 2j ' + θ 2j ⎠
p−1
= w 1ji · p Oi1 +
∂ p U 2j ∂wcd 1
i=1 '
j =1
( p 3) ⎛ 1 ⎞
∂ f Ua ∂w jd p Σ n 2

p−1 2
O '
· ⎝ 1 · Id + ⎠
R j
= W j2j ' · (7.5.11)
∂ p Ua3 ∂wcd '
j =1
∂w 1
cd

Equation (7.5.11) determines the derivative of the output of the j-th unit in the
hidden layer for the p-th learning pattern using that for the (p − 1)-th learning pattern.
Using the value above, Eqs. (7.5.10) and (7.5.9) are used in turn to determine the
∂E
value of ∂w 1 .
cd
∂E
Finally, let us calculate the update of the connection weight ∂ Wab
2 for the feedback
in the hidden layer as

1 ΣΣ
n
∂ ( p 3 p )2
N
∂E 3

= Ok − Tk
∂ Wab
2 2 p=1 k=1 ∂ Wab
2

Σ
N Σ
n3
( ) ∂ p Ok3
= p
Ok3 − p Tk (7.5.12)
p=1 k=1
∂ Wab
2

where
( ) ( )
∂ p Ok3 ∂ f p Uk3 ∂ f p Uk3 ∂ p Uk3
= =
∂ Wab2
∂ Wab
2
∂ p Uk3 ∂ Wab2
230 7 Flow Simulation with Deep Learning

(p ⎛ ) ⎞
∂ ⎝Σ
n2
∂f Uk3
wk2j · R O 2j + θk3 ⎠
p
=
∂ p Uk3 ∂ Wab
2
j=1
( p 3 ) ⎛ n2 ⎞
∂ f Uk Σ ∂
p 2
O
·⎝ ⎠
R j
= wk2j · (7.5.13)
∂ p Uk3 j=1
∂ W 2
ab

and
( ) ( )
p 2
p
∂ R O 2j ∂f RU j ∂f p
U 2j ∂ p U 2
j
= =
∂ Wab
2
∂ Wab
2
∂ p U 2j ∂ Wab 2
( ) ⎛ ⎞
∂ f p U 2j ∂ ⎝Σ
n1 Σn2
W j2j ' · R O 2j ' + θ 2j ⎠
p−1
= w 1ji · p Oi1 +
∂ p U 2j ∂ Wab
2
i=1 '
j =1
( ) ⎛ ⎞
∂ f Ujp 2
∂ W jb
2 Σ n2

p−1 2
O '
·⎝
p−1 2 R j ⎠
= Ob + W j2j ' · (7.5.14)
∂ p U 2j ∂ Wab2 R
j ' =1
∂ W 2
ab

From Eq. (7.5.14), the derivative value of the output of a unit in the hidden layer
2
with respect to Wab for the p-th learning pattern can be calculated using the output
values of the units in the hidden layer and their derivative values for the (p − 1)-th
training pattern. Using the value obtained above, the left-hand side values of Eqs.
(7.5.13) and (7.5.12) are calculated in order, and finally ∂∂WE2 is obtained.
ab
The update values (derivative values) for the bias values θk3 and θ 2j are calculated
in the same manner.
As described above, in BPTT, the amount of update of each parameter can be
calculated by computing the values sequentially according to the order of training
patterns.

7.5.2 Long Short-Term Memory

In this section, the basic items of long short-term memory (LSTM) are discussed [9],
which is an advanced version of recurrent neural networks [1, 7].
There is a process of sequentially calculating the gradients in the order of
the training patterns in the error backpropagation calculation for recurrent neural
networks, and it is known that the gradient vanishing problem may arise similarly to
the calculation of multiple layers in feedforward neural networks.
Therefore, quantities related to training patterns far apart each other cannot affect
the update of the connection weights, resulting in a network that makes predictions
by referring only to relatively last-minute information, which has been considered
to be a problem in operation based on long-term memory.
7.5 Neural Networks for Time-Dependent Data 231

(output)

output gate

(input D)

cell

(input C) forget gate


input gate

(input B)

sum
multiplication

(input A)
Fig. 7.16 Schematic diagram of LSTM memory cell

The long short-term memory (LSTM) network is developed to fix the weaknesses
of recurrent neural networks above. The behavior of the unit in LSTM is shown in
Fig. 7.16 [6]. The LSTM unit in Fig. 7.16 is an extension of the original unit with
the forget gate [4] and peephole connections [3].
The LSTM unit in Fig. 7.16 is designed to perform four different recurrent
processes on the input, taking the input data x t at the current time step (training
pattern) and the output data yt−1 of the LSTM unit at the previous time step (training
pattern) as input, and outputting the output y tj at the current time step. For the same
input data (input A, input B, input C, and input D), the same operations as in the
usual recurrent unit are performed, and the output is calculated based on the results
of these operations. f A (), f B (), f C (), f D (), and f E () are the activation functions and
the sigmoid function is used for f B (), f C (), and f D (), while the tanh function for
f A () and f E (). Another feature of the LSTM unit is that it has a state variable s,
which is given the function of controlling long-term memory. The LSTM unit shown
in Fig. 7.16 has three gates, where the input gate controls the transmission strength
of new input information, the output gate the output strength of memory cells, and
the forget gate the transmission strength of past information.
Let us now look at the operations in the LSTM unit [6]. First, the following
operations are performed on input A as in the recurrent unit,
232 7 Flow Simulation with Deep Learning

Σ
n1 Σ
n2
u A,t
j = w Aji xit + j' + θ j
W jAj ' y t−1 A
(7.5.15)
i=1 j ' =1
( )
g A,t
j = f A u A,t
j (7.5.16)

Next, the following operations are performed on input B as in the recurrent unit,

Σ
n1 Σ
n2
u B,t
j = w Bji xit + j' + θ j + pB · s j
W jBj ' y t−1 B t−1
(7.5.17)
i=1 j ' =1
( )
g B,t
j = f B u B,t
j (7.5.18)

Note that the state variable s t−1


j is added in Eq. (7.5.17).
Next, the following operations are performed on input C as in the recurrent unit.
Note again that the state variable s t−1
j is added.

Σ
n1 Σ
n2
u C,t
j = wCji xit + j ' + θ j + pC · s j
W jCj ' y t−1 C t−1
(7.5.19)
i=1 j ' =1
( )
g C,t
j = f C u C,t
j (7.5.20)

At the input gate, the product of g A,t j obtained by Eq. (7.5.16) and g B,tj by
C,t
Eq. (7.5.18) is calculated. At the forget gate, the product of g j obtained by
Eq. (7.5.20) and the state variable s t−1
j is calculated. The state variables are updated
by summing the results of the input gate and forget gate operations as follows:

s tj = g C,t
j · sj
t−1
+ g B,t A,t
j · gj (7.5.21)

Using the updated state variables, g E,t


j are calculated as follows:
( )
g E,t
j = f E s tj (7.5.22)

Next, the following operations are performed on input D as in the recurrent unit.

Σ
n1 Σ
n2
u D,t
j = w Dji xit + j' + θ j + pD · s j
W jDj ' y t−1 D t
(7.5.23)
i=0 j ' =1
( )
g D,t
j = f D u D,t
j (7.5.24)

Here, the updated state variables s tj are added.


7.6 Numerical Example 233

Finally, at the output gate, the product of g E,t


j calculated in Eq. (7.5.22) and g D,t
j
in Eq. (7.5.24) is obtained as

y tj = g E,t D,t
j · gj (7.5.25)

This value becomes the output value of the LSTM unit.


In Eqs. (7.5.17), (7.5.19), and (7.5.23), p B , pC , and p D are parameters called
peephole weights.
The BPTT algorithm can be employed for error backpropagation training in LSTM
networks as well as recurrent networks. For more details, see Ref. [6, 9].

7.6 Numerical Example

In this section, the application of deep learning to the analysis of a flow field around a
two-dimensional circular cylinder (Sect. 7.3.3) is given in detail [10], where convolu-
tional LSTM networks (Sect. 7.5) are employed. It predicts the vorticity visualization
image at a future time step using the previous visualization images obtained from
the fluid analysis results as input.

7.6.1 Data Preparation Phase

From the analysis results described in Sect. 7.3.3, visualization images of vorticity
and pressure are created for every 100 time steps between the 100th and the 7400th
time steps, achieving 74 images for each of vorticity and pressure. The images are
cropped so that the flow near the cylinder and behind the cylinder are focused. Each
image has an 8-bit grayscale (256 Gy scale) of 200 pixel width by 100 pixel height.
When used as input data, each pixel value is normalized to a real number in the range
of 0–1. Here, the visualized image of vorticity in the nth step is denoted as VICFD (n),
the image of pressure as PICFD (n), and both images as VPICFD (n).
Let’s predict the image of vorticity at the next time point from the images of those
at the last four time points.
In the case of predicting the next vorticity image from vorticity images at past
time points, the following 70 training patterns can be obtained from 74 images. (The
first four images in each pattern are the input data and the last one the teacher data.)
{( ) }
1{(VICFD (100), VICFD (200), VICFD (300), VICFD (400)), VICFD (500)}
2 VICFD (200), VICFD (300), VICFD (400), VICFD (500) , VICFD (600)
· · ·{( ) }
70 VICFD (7000), VICFD (7100), VICFD (7200), VICFD (7300) , VICFD (7400)
234 7 Flow Simulation with Deep Learning

Similarly, in the case of predicting the next vorticity and pressure images from
the vorticity and pressure images at past time points, the images at 74 different times
provide the 70 training patterns shown as
{( ) }
1 VPICFD (100), VPICFD (200), VPICFD (300), VPICFD (400) , VPICFD (500)
{( ) }
2 VPICFD (200), VPICFD (300), VPICFD (400), VPICFD (500) , VPICFD (600)
· · ·{( ) }
70 VPICFD (7000), VPICFD (7100), VPICFD (7200), VPICFD (7300) , VPICFD (7400)

7.6.2 Training Phase

Deep learning using the 70 training patterns created in Sect. 7.6.1 is performed here.
To build a predictor for the vorticity image, input and teacher data are set as
follows:
Input data: VICFD (n − 300), VICFD (n − 200), VICFD (n − 100), VICFD (n)
Teacher data: VICFD (n + 100)
The trained neural network outputs the predicted image of vorticity,
VIDL (n + 100), for the above input data. The number of epochs is set to 100,000,
and the mini-batch size to 10. Training took several dozen hours on the computer
equipped with Intel Core i7-7700 K CPU and NVIDIA TITAN V GPU.
Similarly, to build a predictor for both the vorticity and pressure images, input
and teacher data are set as follows:
Input data: VPICFD (n − 300), VPICFD (n − 200), VPICFD (n − 100), VPICFD (n)
Teacher data: VPICFD (n + 100)
The trained neural network outputs the predicted image of vorticity,
VIDL (n + 100), and also that of pressure, VIDL (n + 100), for the above input data.
The number of epochs is set to 100,000, and the mini-batch size to 10.
Regarding the neural network employed for deep learning, the convolutional
LSTM network used to build a predictor for vorticity images as an example has
the structure as follows:

1 : Input layer : Input : 4 × 200 × 100 Output : 20 × 200 × 100


2 : Convolutional LSTM layer Input : 20 × 200 × 100 Output : 20 × 200 × 100
3 : Convolutional LSTM layer Input : 20 × 200 × 100 Output : 20 × 200 × 100
4 : Convolutional LSTM layer Input : 20 × 200 × 100 Output : 20 × 200 × 100
5 : Convolutional LSTM layer Input : 20 × 200 × 100 Output : 3 × 200 × 100
6 : Convolutional 3D layer Input : 3 × 200 × 100 Output : 1 × 200 × 100
7.6 Numerical Example 235

The first layer, i.e., the input layer, is a convolutional layer. The input data for this
layer are four 200 × 100 pixel vorticity images at four time steps. Each pixel of the
images is a single-precision real number between 0 and 1. The input layer converts
the input data into 20 × 200 × 100 data by convolution operation using 20 filters
(filter size 3 × 3) and sends the data to the next layer.
In the second to fourth convolutional LSTM layers, 20 × 200 × 100 input data are
converted to 20 × 200 × 100 new data by a two-dimensional convolution operation
using 20 filters (filter size 3 × 3), which are sent to the next layer.
In the fifth layer, which is also the convolutional LSTM layer, 20 × 200 × 100
input data are converted to 3 × 200 × 100 new data by a two-dimensional convolution
operation using three filters (filter size 3 × 3), which are sent to the next layer.
In the sixth layer, the convolutional three-dimensional layer, 3 × 200 × 100 input
data are converted to 1 × 200 × 100 new data by a three-dimensional convolution
operation using a single filter (filter size 3 × 3 × 3), which are output. As the output
image is based on real numbers, they are converted to integer values between 0 and
255, resulting in a normal 8-bit (256 shades) image.
Note that the convolutional LSTM layer used here is a convolutional neural
network with LSTM [11].

7.6.3 Application Phase

Here, the neural network constructed in Sect. 7.6.2 is used to predict the flow field
around a two-dimensional circular cylinder.
Figure 7.17 shows the predicted results for the data included in the training patterns,
depicting the vorticity images between 3200th and 3500th time steps. In the figure,
CFD denotes the image based on the analysis results obtained from the computational
fluid dynamics simulation, which is considered as the correct image here. The others
are, respectively, those by deep learning when the images based on the analysis
results are used as the input (DL with VICFD ) and those when the predicted images
are used as the input (DL with VIDL ). The images in the figure are achieved by the
trained neural network with the following data (images) as input data,
DL with V I C F D :
( )
t = 3200 VICFD (2800), VICFD (2900), VICFD (3000), VICFD (3100)
( )
t = 3300 VICFD (2900), VICFD (3000), VICFD (3100), VICFD (3200)
( )
t = 3400 VICFD (3000), VICFD (3100), VICFD (3200), VICFD (3300)
( )
t = 3500 VICFD (3100), VICFD (3200), VICFD (3300), VICFD (3400)

DL with V I DL :
( )
t = 3200 VICFD (2800), VICFD (2900), VICFD (3000), VICFD (3100)
236 7 Flow Simulation with Deep Learning

t=3200 t=3300 t=3400 t=3500

CFD
(Correct)

DL
(with )

DL
(with )

Fig. 7.17 Predicted images of vorticities for time steps included in training patterns

( )
t = 3300 VICFD (2900), VICFD (3000), VICFD (3100), VIDL (3200)
( )
t = 3400 VICFD (3000), VICFD (3100), VICFD (3200), VIDL (3300)
( )
t = 3500 VICFD (3100), VICFD (3200), VICFD (3300), VIDL (3400)

The results in Fig. 7.17 show that deep learning is generally able to make good
predictions for this case. Here, the Structural Similarity Index Measure (SSIM) [12],
which is a similarity index of images, is about 70%. (The SSIM is 100% in the case
of an exact match.) When the prediction results by deep learning are also used as
input (DL with VIDL ), the accuracy is lower than when only CFD results are used as
input (DL with VICFD ), but it could be improved by data augmentation to suppress
overtraining.
Figure 7.18 shows the prediction results of the vorticity image for the data not
included in the training pattern, where the vorticity images at the 20200th time step
to the 20300th, 20400th, and 20500th time steps are given. Since the vorticity images
at these time steps are not included in the training patterns, it is considered that the
generalization capability for unknown input data is verified. Note that the images
used as input to the trained neural network are selected in the same manner as those
used for Fig. 7.17, which are shown as follows:
DL with VICFD :
( )
t = 20200 VICFD (19800), VICFD (19900), VICFD (20000), VICFD (20100)
( )
t = 20300 VICFD (19900), VICFD (20000), VICFD (20100), VICFD (20200)
( )
t = 20400 VICFD (20000), VICFD (20100), VICFD (20200), VICFD (20300)
( )
t = 20500 VICFD (20100), VICFD (20200), VICFD (20300), VICFD (20400)

DL with V I DL :
( )
t = 20200 VICFD (19800), VICFD (19900), VICFD (20000), VICFD (20100)
( )
t = 20300 VICFD (19900), VICFD (20000), VICFD (20100), VIDL (20200)
7.6 Numerical Example 237

t=20200 t=20300 t=20400 t=20500

CFD
(Correct)

DL
(with )

DL
(with )

Fig. 7.18 Predicted images of vorticities for time steps not included in training patterns

( )
t = 20400 VICFD (20000), VICFD (20100), VIDL (20200), VIDL (20300)
( )
t = 20500 VICFD (20100), VIDL (20200), VIDL (20300), VIDL (20400)

The results in Fig. 7.18 show that deep learning is generally able to make good
predictions also for this case. Compared to Fig. 7.17, the prediction accuracy for the
input data not included in the training patterns is poorer than that for the input data
included in the training patterns. It is considered effective to suppress overtraining,
for example, by expanding the training patterns during training.
Next, Fig. 7.19 shows the prediction results of the vorticity image when both the
vorticity and the pressure images are predicted simultaneously, where the time steps
of the predicted images are from the 3200th to the 3400th steps and the input data for
deep learning are images calculated by computational fluid dynamics analysis. For
comparison, the images predicted only from the vorticity images (DL with VICFD )
are also shown. It is noted that the results with both vorticity and pressure images as
input (DL with VPICFD ) are comparable to those with only vorticity images as input
(DL with VICFD .
Finally, Fig. 7.20 shows the prediction results of the pressure image when both the
vorticity and the pressure images are predicted simultaneously, where the time steps
of the predicted images are from the 3200th to the 3400th steps and the input data
for deep learning are images calculated by computational fluid dynamics analysis.
According to the prediction result of the pressure image by deep learning (DL with
VPICFD ), the location of the low pressure area (black in the figure) is almost consistent
with that by the CFD, but there is more noise than that of the vorticity image.
As described above, it can be seen that deep learning using the convolutional
LSTM network can predict well the results of computational fluid dynamics analysis.
The time required for prediction by deep learning is less than one second in order to
obtain a solution 100 steps ahead, while, in the case of CFD, it is necessary to analyze
whole time steps along the way by time-consuming computational fluid dynamics
simulation. The prediction accuracy by deep learning could be further improved by
optimizing the convolutional LSTM network structure and increasing the number of
training patterns to prevent overtraining and improve the generalization capability.
238 7 Flow Simulation with Deep Learning

t=3200 t=3300 t=3400

CFD
(Correct)

DL
(with )

DL
(with )

Fig. 7.19 Predicted images of vorticities using images of vorticities and pressures

t=3200 t=3300 t=3400

CFD
(Correct)

DL
(with )

Fig. 7.20 Predicted images of pressures using images of vorticities and pressures

Acknowledgements We would like to express our gratitude to Dr. Masato Masuda, Prof. Yasushi
Nakabayashi, and Prof. Yoshiaki Tamura for providing the data for this chapter. Figures in
Sections 7.3 and 7.6 are based on the provided data. We are also grateful to Prof. Yoshiaki Tamura for
his kind advice on the description of Section 7.3 and to Dr. Masato Masuda for that of Sections 7.5
and 7.6.

References

1. Elman, J.L.: Finding structure in time. Cogn. Sci. 14, 179–211 (1990)
2. Ferziger, J.H., Peric, M.: Computational Methods for Fluid Dynamics (Second Edition).
Springer (1999)
3. Gers, F.A., Schmidhuber, J.A.: Recurrent nets that time and count. Proceedings of the
IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural
Computing: New Challenges and Perspectives for the New Millennium, Vol. 3, 2000,
pp. 189–194, DOI: https://doi.org/10.1109/IJCNN.2000.861302.
References 239

4. Gers, F.A., Schmidhuber, J.A., Cummins, F.A.: Learning to Forget: Continual Prediction with
LSTM. Neural Comput. 12(10), 2451–2471 (2000). DOI: https://doi.org/10.1162/089976600
300015015
5. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning, MIT Press (2016)
6. Greff, K., Srivastava, R.K., Koutník, J., Steunebrink, B.R., Schmidhuber, J.: LSTM: A Search
Space Odyssey. IEEE Trans. Neural Netw. Learn. Sys. 28(10), 2222–2232 (2017). DOI: https://
doi.org/10.1109/TNNLS.2016.2582924.
7. Heykin, S.: Neural Networks: A comprehensive Foundation. Prentice Hall (1999)
8. Hirsh, C.: Numerical Computation of Internal and External Flows: The Fundamentals of
Computational Fluid Dynamics (Second Edition). Butterworth-Heinemann (2007)
9. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780
(1997)
10. Masuda, M., Nakabayashi, Y., Tamura, Y.: Prediction of computational fluid dynamics results
using convolutional LSTM. Transactions of JSCES. 2020, 20201006 (2020). (in Japanese)
11. Shi, X., Chen, Z., Wang, H., Yeung, D.-Y.,Wong, W.-K., Woo, W.-C.: Convolutional LSTM
Network: a machine learning approach for precipitation nowcasting. In Proceedings of the 28th
International Conference on Neural Information Processing Systems (NIPS’15), Vol. 1, 2015,
pp. 802–810.
12. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error
visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004). DOI:
https://doi.org/10.1109/TIP.2003.819861.
13. Williams, R.J., Peng, J.: An efficient gradient-based algorithm for on-line training of recurrent
network trajectories. Neural Comput. 2, 490–501 (1990)
14. Williams, R.J., Zipser, D.: A learning algorithm for continually running fully recurrent neural
network. Neural Comput. 1, 270–280 (1989)
15. Zienkiewicz, O.C., Morgan, K.: Finite Elements and Approximation, Dover (2006)
Chapter 8
Further Applications with Deep Learning

Abstract In this chapter, some additional applications of deep learning in the field
of computational mechanics are discussed: a method of improving the accuracy of
element stiffness matrices (Sect. 8.1), finite element analysis using convolutional
operations (Sect. 8.2), fluid analysis using variational autoencoders (Sect. 8.3), a
zooming method using feedforward neural networks (Sect. 8.4), and an application
of physics-informed neural networks to solid mechanics (Sect. 8.5).

8.1 Deep Learned Finite Elements

In the finite element method, methods for improving the accuracy of solutions can
be classified into two main categories shown as follows:
A; Methods reducing the size of elements with a large number of elements
B; Methods increasing the order of the basis functions without reducing the
element size
Method A improves the accuracy of solutions by reducing the element size with
low-order basis functions, which results in reducing the variation of physical quan-
tities (such as displacements) in an element. On the other hand, Method B improves
the accuracy of solutions by taking advantage of the high approximation capability
of the basis functions of higher order.
The accuracy improvement of approximation of the basis functions in an element
naturally results in that of the element stiffness matrix, and finally the better finite
element solution. In other words, the quality of the element stiffness matrices and the
global stiffness matrix directly affects the accuracy of the finite element solutions.
Then, we have discussed a method to improve the quality of the element stiffness
matrix by optimizing the numerical integration parameters using deep learning in
Chap. 4. Here, another method is reviewed, where the strain–displacement matrix
involved in the element integration is improved by deep learning [9].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 241
G. Yagawa and A. Oishi, Computational Mechanics with Deep Learning,
Lecture Notes on Numerical Methods in Engineering and Sciences,
https://doi.org/10.1007/978-3-031-11847-0_8
242 8 Further Applications with Deep Learning

8.1.1 Two-Dimensional Quadratic Quadrilateral Element

First, let’s review the quadratic quadrilateral isoparametric element which is of


interest in this section. As shown in Fig. 8.1, it has eight nodes in total, and
the displacements and coordinates at an arbitrary position in an element are
approximated, respectively, as follows:
( ) ( ) Σ8 ( )
u u(ξ, η) Ui
{u} = = = Ni (ξ, η) · (8.1.1)
v v(ξ, η) Vi
i=1
( ) ( ) Σ8 ( )
x x(ξ, η) Xi
{x} = = = Ni (ξ, η) · (8.1.2)
y y(ξ, η) Yi
i=1

where (Ui , Vi )T and (X i , Yi )T are the displacements and coordinates of the i-th node
belonging to the element, respectively.
The basis functions in Eqs. (8.1.1) and (8.1.2) are given by the following equations.

1
N1 (ξ, η) = (1 − ξ )(1 − η)(−ξ − η − 1) (8.1.3)
4
1
N2 (ξ, η) = (1 + ξ )(1 − η)(ξ − η − 1) (8.1.4)
4

Fig. 8.1 Quadratic


quadrilateral element

4 7 (0,1) 3
(-1,1) (1,1)

8 6
(-1,0) 0 (1,0)

1 5 (0,-1) 2
(-1,-1) (1,-1)
8.1 Deep Learned Finite Elements 243

1
N3 (ξ, η) = (1 + ξ )(1 + η)(ξ + η − 1) (8.1.5)
4
1
N4 (ξ, η) = (1 − ξ )(1 + η)(−ξ + η − 1) (8.1.6)
4
1( )
N5 (ξ, η) = 1 − ξ 2 (1 − η) (8.1.7)
2
1 ( )
N6 (ξ, η) = (1 + ξ ) 1 − η2 (8.1.8)
2
1( )
N7 (ξ, η) = 1 − ξ 2 (1 + η) (8.1.9)
2
1 ( )
N8 (ξ, η) = (1 − ξ ) 1 − η2 (8.1.10)
2
These basis functions apparently satisfy the fundamental equations in the finite
element approximation as shown below.

Σ
8
Ni (ξ, η) = 1 (for arbitrary ξ, η) (8.1.11)
i=1
{
( ) 0 (i /= j)
Ni X j , Y j = (8.1.12)
1 (i = j)

The vector {U } of nodal displacements in an element is defined as


⎛ ⎞
U1
⎜ . ⎟
⎜ .. ⎟
⎜ ⎟
⎜ ⎟
⎜U ⎟
{U } = ⎜ 8 ⎟ (8.1.13)
⎜ V1 ⎟
⎜ ⎟
⎜ .. ⎟
⎝ . ⎠
V8

Further, the matrix [N ] of the shape (basis) functions in an element is defined as


[ ]
N1 · · · N8 0 · · · 0
[N ] = (8.1.14)
0 · · · 0 N1 · · · N8

Then, the displacement vector {u} at an arbitrary point in an element is given as


244 8 Further Applications with Deep Learning
⎛ ⎞
U1
⎛ 8 ⎞ ⎜ . ⎟
Σ ⎜ .. ⎟
[ ]⎜



⎜ i=1 Ni Ui ⎟
{u} = ⎜ ⎟ = N1 · · · N8 0 · · · 0 ⎜⎜
U8 ⎟
⎟ = [N ]{U } (8.1.15)
⎝Σ 8 ⎠ 0 · · · 0 N1 · · · N8 ⎜ V1 ⎟
Ni Vi ⎜ . ⎟
⎜ . ⎟
i=1 ⎝ . ⎠
V8

The strain {ε} at any point in an element can be expressed using the nodal
displacement vector {U } as follows:
⎛ ⎞ ⎛ ∂u
⎞ ⎡


εx ∂x ∂x
0 ( )
⎜ ∂v ⎟ ⎢ 0 ∂ ⎥ u
{ε} = ⎝ ε y ⎠ = ⎝ ∂y ⎠=⎣ ∂y ⎦ = [L]{u} = [L][N ]{U } = [B]{U }
∂u ∂v ∂ ∂ v
γx y ∂y
+ ∂x ∂y ∂x
(8.1.16)

where [B] is the strain–displacement matrix, and its components are shown as
⎡ ⎤ ⎡ ∂N ⎤

∂x
0 [ ] ∂x
· · · ∂∂Nx8 0 · · · 0
1

⎢ ∂ ⎥ N1 · · · N8 0 · · · 0 ⎢ ∂N ∂N ⎥
[B] = [L][N ] = ⎣ 0 ∂y ⎦ = ⎣ 0 · · · 0 ∂ y1 · · · ∂ y8 ⎦
0 · · · 0 N1 · · · N8 ∂ N1
∂ ∂
∂y ∂x ∂y
· · · ∂∂Ny8 ∂∂Nx1 · · · ∂∂Nx8
(8.1.17)

The stress {σ } at any point in an element can also be expressed using the nodal
displacement vector {U } as follows:
⎛ ⎞ ⎛ ⎞
σx εx
{σ } = ⎝ σ y ⎠ = [D]⎝ ε y ⎠ = [D]{ε} = [D][L][N ]{U } = [D][B]{U }
τx y γx y
(8.1.18)

where [D] is the stress–strain matrix, which is defined by the Young’s modulus E
and the Poisson’s ratio ν for a two-dimensional isotropic elastic body. Note that [D]
is different between the plane stress and plane strain approximations.
In the case of plane stress approximation:
⎡ ⎤
1ν 0
E
[D] = ( )⎣ν 1 0 ⎦ (8.1.19)
1 − ν2
0 0 1−ν
2

In the case of plane strain approximation:


8.1 Deep Learned Finite Elements 245

(a) (b)

Fig. 8.2 Quadratic quadrilateral element and corresponding reference model

⎡ ⎤
1−ν ν 0
E ⎣ ν 1−ν
[D] = 0 ⎦ (8.1.20)
(1 + ν)(1 − 2ν) 1−2ν
0 0 2

Using the matrices described above, the element stiffness matrix is achieved as

[ e]
k = [B]T [D][B]dv (8.1.21)
ve

where v e denotes that the integral is performed over the entire domain of the element.
Similar to what discussed for linear elements in Chap. 4, the degradation of
the accuracy of the element stiffness matrix due to the distortion of the element
shape also occurs for quadratic quadrilateral elements. Let’s consider the strain in
a quadratic quadrilateral element shown in Fig. 8.2a due to nodal displacements
{ } ( )T
U = U 1 , · · · , U 8 , · · · , V 1 , · · · , V 8 , . The strain at an arbitrary position in the
element can be expressed by Eq. (8.1.16), but the result obtained may contain some
error if the element geometry is distorted.
Then, we discuss how we can obtain an accurate strain. One way is to divide
the element into many smaller elements as shown in Fig. 8.2b. For example, if the
original quadratic quadrilateral element is divided into 50 × 50 linear quadrilateral
elements, the strains in each element may be obtained with high accuracy. The sizes
of the elements in this case are considered to be sufficiently small that the linear
element would suffice, and the mesh of Fig. 8.2b is called the reference model of the
element of Fig. 8.2a.
Each nodal point on the periphery of the reference model is loaded by displace-
ments interpolated from those of nodal points of the original quadratic quadrilateral
element using quadratic basis functions and the strain values at arbitrary locations
within the element can be accurately calculated. Note that those calculated by the
reference model depend on the Poisson’s ratio, but not on Young’s modulus.
246 8 Further Applications with Deep Learning

It has been reviewed so far that the distortion of the element shape affects the
accuracy of the element stiffness matrix even for quadratic quadrilateral elements,
and that the reference model can be used to calculate the correct strain field. With
the reference model, a method for obtaining a highly accurate strain–displacement
matrix [B] by deep learning will be discussed in what follows. In preparation, the
properties of the strain–displacement matrix [B] are now studied in detail.
As shown in Eq. (8.1.17), the strain–displacement matrix [B] of a quadratic
quadrilateral element is of 3 rows and 16 columns, which can be written as
⎡ ⎤
b1,1 b1,2 b1,15 b1,16
[B] = ⎣ b2,1 b2,2 · · · b2,15 b2,16 ⎦ (8.1.22)
b3,1 b3,2 b3,15 b3,16

Here, let the displacements in the element be those of rigid body due to translation,
( )T
i.e., (U1 , V1 )T = · · · = (U8 , V8 )T = U , V . In this case, the strain values in the
element must be zero. Therefore, we have the following equation.
⎛ ⎞
U
⎜ . ⎟
⎛ ⎞ ⎡ ⎤⎜ .
⎜ .


0 b1,1 b1,2 b1,15 b1,16 ⎜ ⎟
⎝ ⎠ ⎣
{ε} = 0 = [B]{U } = b2,1 b2,2 · · · b2,15 b2,16 ⎦⎜

U ⎟

⎜V ⎟
0 b3,1 b3,2 b3,15 b3,16 ⎜ ⎟
⎜ .. ⎟
⎝ . ⎠
V
⎛ ⎞
b1,1 + b1,2 + b1,3 + b1,4 + b1,5 + b1,6 + b1,7 + b1,8
= ⎝ b2,1 + b2,2 + b2,3 + b2,4 + b2,5 + b2,6 + b2,7 + b2,8 ⎠U
b3,1 + b3,2 + b3,3 + b3,4 + b3,5 + b3,6 + b3,7 + b3,8
⎛ ⎞
b1,9 + b1,10 + b1,11 + b1,12 + b1,13 + b1,14 + b1,15 + b1,16
+ ⎝ b2,9 + b2,10 + b2,11 + b2,12 + b2,13 + b2,14 + b2,15 + b2,16 ⎠V (8.1.23)
b3,9 + b3,10 + b3,11 + b3,12 + b3,13 + b3,14 + b3,15 + b3,16

Since U and V are independent of each other and Eq. (8.1.23) holds for any
( )T
U , V , we have

b1,1 + b1,2 + b1,3 + b1,4 + b1,5 + b1,6 + b1,7 + b1,8 = 0


b2,1 + b2,2 + b2,3 + b2,4 + b2,5 + b2,6 + b2,7 + b2,8 = 0
b3,1 + b3,2 + b3,3 + b3,4 + b3,5 + b3,6 + b3,7 + b3,8 = 0
(8.1.24)
b19 + b1,10 + b1,11 + b1,12 + b1,13 + b1,14 + b1,15 + b1,16 = 0
b29 + b2,10 + b2,11 + b2,12 + b2,13 + b2,14 + b2,15 + b2,16 = 0
b39 + b3,10 + b3,11 + b3,12 + b3,13 + b3,14 + b3,15 + b3,16 = 0
8.1 Deep Learned Finite Elements 247

This suggests that components of the strain–displacement matrix [B] are not inde-
pendent of each other; for example, b1,8 can be obtained from the other components
by b1,8 = −b1,1 − b1,2 − b1,3 − b1,4 − b1,5 − b1,6 − b1,7 , leading to the result that the
number of independent components of the strain–displacement matrix [B] is reduced
to 3 × 14.

8.1.2 Improvement of Accuracy of [B] Matrix Using Deep


Learning

8.1.2.1 Data Preparation Phase

First, prepare a number of quadratic quadrilateral elements with various shapes,


where the four corner nodes of the elements are arranged as shown in Fig. 8.3, with
the nodes 1 and 2 fixed at (0,0) and (1,0), respectively, while the nodes 3 and 4 are
movable in various locations. It is also assumed that the maximum length of edges
is 1, and the middle nodes are set at the midpoints of the edges. This is called the
standard arrangement.
Note that any quadratic quadrilateral element can be transformed into the stan-
dard arrangement by translation, rotation, and scaling. Here, various values of the
Poisson’s ratio and nodal displacements are set for each of these elements. Let
Shape(i) be the shape parameters, Poisson(i) the Poisson’s ratio, and Disp(i) the
nodal displacements for the i-th setting, then each element is represented by 21
parameters in total as follows:
{ }
Shape(i) = X 3i , Y3i , X 4i , Y4i (8.1.25)

{ }
Poisson(i) = ν i (8.1.26)

Fig. 8.3 Normalized y


element geometry
3

x
1 2
248 8 Further Applications with Deep Learning

{ i i i i
}
Disp(i) = U 1 , · · · , U 8 , V 1 , · · · , V 8 (8.1.27)

Next, a set of the exact values of strains at the integration points of the original
element, Strain(i), is calculated by the finite element analysis using the reference
model created for each element, which is assumed to be a finite element model
prepared by dividing the original element into 50 × 50 first-order quadrilateral
elements. The number of integration points is set to three per axis or nine per element.
Strain(i), which consists of three values per integration point, is defined as
{1 }
Strain(i) = ε̂xi , 1 ε̂iy , 1 γ̂xi y , 2 ε̂xi , 2 ε̂iy , 2 γ̂xi y , · · · , 9 ε̂xi , 9 ε̂iy , 9 γ̂xi y (8.1.28)

where the superscripts at the left shoulders of the components are the numbers
of integration points, and the superscripts on the right shoulders those of element
settings.
Thus, a set of data {Shape(i), Shape(i), Disp(i), Strain(i)} are obtained for each
element. We generate 300,000 sets of data as training patterns, and 30,000 sets of
data for validation of the trained network in addition.

8.1.2.2 Training Phase

Here, the sets of data collected above are used to construct a feedforward neural
network. As the input data to the feedforward neural network, the elemental shape
parameters and the Poisson’s ratio are employed, i.e.
Input data : Shape(i), Poisson(i)
On the other hand, as the teacher data of the feedforward neural network, the
strain–displacement matrix [g B] at the integration points (9 points in total) is used,
i.e.,
Teacher data : [g B](g = 1, · · · , 9)
Since each [g B] has 3 × 14 independent components, the number of parameters
in one teacher data is 378. Thus, the feedforward neural network to be constructed
should have 5 units in the input layer and 378 units in the output layer.
The error function to be minimized is assumed as follows:
(| | | | | |)
1 ΣΣ
N 9
| g εxi −g ε̂xi | || g εiy −g ε̂iy || || g γxi y −g γ̂xi y ||
E= wg · || g i |+|
| | g ε̂i |+| | (8.1.29)
9· N ε̂x y | | g γ̂ i
xy |
i=1 g=1

where N is the number of training patterns, which is 300,000 in this case, wg the
weight of the numerical integration at the g-th integration point, which is a constant
given as a product of the weights in each axis direction, and g ε̂xi , g ε̂iy and g γ̂xi y the
accurate strain values calculated by the reference model given as Strain(i). g εxi , g εiy
8.1 Deep Learned Finite Elements 249

and g γxi y are the strain values calculated from [g B] output by the neural network and
{ i i i i
}
Disp(i) = U 1 , · · · , U 8 , V 1 , · · · , V 8 , obtained by

i
⎞ ⎛
U1
⎜ . ⎟
⎛ ⎞ ⎜ . ⎟
⎜ . ⎟
εx
g i ⎜ i⎟
⎜ g i ⎟ [g ]⎜ U8 ⎟
⎝ εy ⎠ = B ⎜ ⎟
⎜ Vi ⎟ (8.1.30)
γx y
g i ⎜ ⎟
⎜ .1 ⎟
⎜ . ⎟
⎝ . ⎠
i
V8

Now, a feedforward neural network is trained by the error back propagation algo-
rithm to output [g B] so that the correct strain values are obtained at the nine integration
points. Then, the numerical integration is performed to obtain an accurate element
stiffness matrix with [g B] output by the trained neural network above.
The feedforward neural network used here has five hidden layers and 378 units
per hidden layer. In each hidden layer, the batch normalization (see Sect. 1.3.5) is
employed, while ELU (Exponential Linear Unit) as the activation function is used,
which is given by
{
x (x ≥ 0)
f (x) = (8.1.31)
a(e x − 1) (x < 0)

where a is a positive constant.


With 300,000 training patterns and the mini-batch size of 50,000, the Adam (see
Sect. 2.3.3) has been used as the optimization method, setting the number of training
epochs to 30,000.

8.1.2.3 Application Phase

The trained neural network above is used to calculate the stiffness matrix for a new
element with high accuracy. The procedure is summarized as follows:
1. Standard arrangement: Convert a new element to the standard arrangement (See
Sect. 8.1.2.1). [ ]
g
2. Calculation[of OUT ] B : Estimate [g B] using the trained neural network, which is
g
denoted as OUT B . [g ]
3. Calculation of : Since DL B is the matrix for the element converted to the
[standard
g ] configuration, it is re-converted to the matrix for the original element
B .
DL [g ] [g ]
4. Calculation of DL B : Improve DL B by the B-bar method [7, 19] to obtain
[g ]
DL B .
250 8 Further Applications with Deep Learning
[g ]
5. Calculation of [DL K ]: Calculate the element stiffness matrix [DL K ] using DL B .
Let’s look at each of these steps in order.
Standard arrangement:
The target element is converted to the standard arrangement as shown in Fig. 8.4.
First, four corner nodes of the element are numbered from 1 to 4 in a counter-
clockwise manner. Note the nodes at both ends of the longest edge are numbered 1
and 2, respectively. Then, the element is translated so that the first node is located
at the origin.
Then, α being the angle measured counterclockwise between the longest edge
of the element and the positive direction of the x-axis, the element is rotated
clockwise by α around the origin. Thus, the longest edge is placed along the x-
axis. Finally, the element size is proportionally increased or decreased so that the
length of the longest edge is adjusted to be 1 or the second node is located at (1,0).
When the original coordinates of the i-th node are X i = (X i , Yi )T , the transformed
coordinates input X i are obtained by

1
input X = [R(α)]T (X i − X 1 ) (8.1.32)
lmax

3
y y
2
4

1
x x
0 0

y y

x x
0 1 0

Fig. 8.4 Transformation to normalized element geometry


8.1 Deep Learned Finite Elements 251

where $l_{\max}$ is the length of the longest edge, and $[R(\alpha)]$ the matrix representing
the rotation by α counterclockwise around the origin, which is given as

\[
[R(\alpha)] =
\begin{bmatrix}
\cos\alpha & -\sin\alpha \\
\sin\alpha & \cos\alpha
\end{bmatrix}
\tag{8.1.33}
\]
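
The standard arrangement of Eqs. (8.1.32) and (8.1.33) can be sketched in Python with
NumPy as follows; this is only an illustrative sketch (not the reference implementation
of [9]), and it assumes the four corner nodes are already numbered so that nodes 1 and 2
span the longest edge.

import numpy as np

def to_standard_arrangement(X):
    # X: (4, 2) corner coordinates; rows 0 and 1 (nodes 1 and 2) are the
    # ends of the longest edge. Returns the coordinates of Eq. (8.1.32).
    X = np.asarray(X, dtype=float)
    edge = X[1] - X[0]                    # longest edge vector
    lmax = np.linalg.norm(edge)           # length of the longest edge
    alpha = np.arctan2(edge[1], edge[0])  # angle to the positive x-axis
    c, s = np.cos(alpha), np.sin(alpha)
    R = np.array([[c, -s], [s, c]])       # [R(alpha)] of Eq. (8.1.33)
    return (X - X[0]) @ R / lmax          # rows are [R(alpha)]^T (X_i - X_1) / lmax

After the transformation, node 1 lies at (0, 0) and node 2 at (1, 0) by construction.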
Calculation of $[{}^g_{\mathrm{OUT}}B]$:
Inputting ${}_{\mathrm{input}}X_3$, ${}_{\mathrm{input}}X_4$ and the Poisson's ratio to the trained neural network,
$[{}^g_{\mathrm{OUT}}B]$ at the 9 integration points are obtained.

Calculation of $[{}^g_{\mathrm{DL}}B]$:
Since $[{}^g_{\mathrm{OUT}}B]$ is the matrix for the element in the standard arrangement, it should
be converted to the matrix for the original element $[{}^g_{\mathrm{DL}}B]$ using

\[
[{}^g_{\mathrm{DL}}B] = \frac{1}{l_{\max}} [T]\, [{}^g_{\mathrm{OUT}}B]\, [Q]^T
\tag{8.1.34}
\]

where $[T]$ is the transformation matrix of the strain and $[Q]$ the rotation matrix
of the displacements (16 rows by 16 columns), which are, respectively, given by

\[
[T] =
\begin{bmatrix}
\cos^2\alpha & \sin^2\alpha & -\sin\alpha\cos\alpha \\
\sin^2\alpha & \cos^2\alpha & \sin\alpha\cos\alpha \\
2\sin\alpha\cos\alpha & -2\sin\alpha\cos\alpha & \cos^2\alpha - \sin^2\alpha
\end{bmatrix}
\tag{8.1.35}
\]

and

\[
[Q] =
\begin{bmatrix}
\cos\alpha\,[I] & -\sin\alpha\,[I] \\
\sin\alpha\,[I] & \cos\alpha\,[I]
\end{bmatrix}
\tag{8.1.36}
\]

where [I ] is the unit matrix of 8 rows and 8 columns.


Calculation of $[{}^g_{\mathrm{DL}}\bar{B}]$:
$[{}^g_{\mathrm{DL}}\bar{B}]$ is calculated using the following equation [19].

\[
[{}^g_{\mathrm{DL}}\bar{B}] = [{}^g_{\mathrm{DL}}B] + [{}_{\mathrm{DL}}B']
\tag{8.1.37}
\]

with

\[
[{}_{\mathrm{DL}}B'] = \frac{1}{V} \sum_{g=1}^{9} w_g \left( [{}^g_{\mathrm{Q8}}B] - [{}^g_{\mathrm{DL}}B] \right) \left| [{}^g J] \right|
\tag{8.1.38}
\]
where V is the volume (area in this case) of the element, $[{}^g_{\mathrm{Q8}}B]$ is the standard
strain–displacement matrix for the quadratic quadrilateral element, and $[{}^g J]$ is
the Jacobian matrix.
Calculation of $[{}_{\mathrm{DL}}K]$:
Finally, the element stiffness matrix is obtained as

\[
[{}_{\mathrm{DL}}K] = \sum_{g=1}^{9} w_g\, [{}^g_{\mathrm{DL}}\bar{B}]^T [D]\, [{}^g_{\mathrm{DL}}\bar{B}]\, \left| [{}^g J] \right|
\tag{8.1.39}
\]
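
A compact NumPy sketch of Eqs. (8.1.37)–(8.1.39) is given below, assuming the nine
matrices $[{}^g_{\mathrm{DL}}B]$ and $[{}^g_{\mathrm{Q8}}B]$, the Jacobian determinants and the integration weights have
already been obtained; all names are placeholders, and the element area V is evaluated
here as the weighted sum of the Jacobian determinants.

import numpy as np

def stiffness_from_dl_b(B_dl, B_q8, detJ, w, D):
    # B_dl, B_q8: (9, 3, 16) B matrices at the 9 integration points
    # detJ: (9,) Jacobian determinants, w: (9,) weights, D: (3, 3)
    V = np.sum(w * detJ)                                              # element area
    B_prime = np.einsum('g,gij,g->ij', w, B_q8 - B_dl, detJ) / V      # Eq. (8.1.38)
    B_bar = B_dl + B_prime                                            # Eq. (8.1.37)
    return np.einsum('g,gki,kl,glj,g->ij', w, B_bar, D, B_bar, detJ)  # Eq. (8.1.39)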

Let’s discuss an application of this method [9]. The problem is shown in Fig. 8.5.
Tested are the element division, both equal division with square elements as shown
in Fig. 8.6a and that with distorted elements as shown in Fig. 8.6b. The results are
shown in Fig. 8.7, where the horizontal axis is the logarithm of the element size and
the vertical axis the logarithm of the error. From the results of quadratic elements,
it can be seen that the solution obtained by this method (DL8 in the figure) is more
accurate than the normal element (Q8) and the modified element (P183) used in
ANSYS.

Fig. 8.5 Block problem. Reprinted from [9] with permission from Elsevier

Fig. 8.6 Meshes used for block problem: a Regular meshes and b distorted meshes. Reprinted
from [9] with permission from Elsevier

8.2 FEA-Net

Convolutional neural networks are one of the most important key technologies for
deep learning. In this section, FEA-net [18] is studied, where the main operations in
the finite element method are represented by convolution operations and the analysis
is performed with multiple convolution layers.

8.2.1 Finite Element Analysis (FEA) With Convolution

First, we show how the matrix–vector product of the global stiffness matrix and the
displacement vector in the finite element method can be expressed by convolution
operations.
Consider a two-dimensional stress analysis of an object that is evenly divided
into quadrilateral elements of the first order as shown in Fig. 8.8. The finite element
method for this problem is given as

[K ]{U } = { f } (8.2.1)

where [K ] is the global stiffness matrix and {U } the vector of displacements of all
the nodes.
Let the element nodal vector of the e-th quadrilateral element of the first order be
represented as

\[
\{U^e\} =
\begin{pmatrix}
U_1^e \\ V_1^e \\ \vdots \\ U_4^e \\ V_4^e
\end{pmatrix},
\tag{8.2.2}
\]

Fig. 8.7 Convergence curves in the block problem: a Regular meshes and b distorted meshes.
Reprinted from [9] with permission from Elsevier

Fig. 8.8 Two-dimensional regular mesh with quadrilateral elements

and the element stiffness matrix of the element as

\[
[k^e] =
\begin{bmatrix}
k_{11}^e & k_{12}^e & k_{13}^e & k_{14}^e & k_{15}^e & k_{16}^e & k_{17}^e & k_{18}^e \\
k_{21}^e & k_{22}^e & k_{23}^e & k_{24}^e & k_{25}^e & k_{26}^e & k_{27}^e & k_{28}^e \\
k_{31}^e & k_{32}^e & k_{33}^e & k_{34}^e & k_{35}^e & k_{36}^e & k_{37}^e & k_{38}^e \\
k_{41}^e & k_{42}^e & k_{43}^e & k_{44}^e & k_{45}^e & k_{46}^e & k_{47}^e & k_{48}^e \\
k_{51}^e & k_{52}^e & k_{53}^e & k_{54}^e & k_{55}^e & k_{56}^e & k_{57}^e & k_{58}^e \\
k_{61}^e & k_{62}^e & k_{63}^e & k_{64}^e & k_{65}^e & k_{66}^e & k_{67}^e & k_{68}^e \\
k_{71}^e & k_{72}^e & k_{73}^e & k_{74}^e & k_{75}^e & k_{76}^e & k_{77}^e & k_{78}^e \\
k_{81}^e & k_{82}^e & k_{83}^e & k_{84}^e & k_{85}^e & k_{86}^e & k_{87}^e & k_{88}^e
\end{bmatrix}
\tag{8.2.3}
\]

Figure 8.9 shows a part of Fig. 8.8, where the four elements sharing a node and their
element node numbers are depicted. Similarly, Fig. 8.10 shows the two-dimensional
location index of nodes.
From these two figures, the relations between $U_{i,j}$, the displacement in the x-direction
in the two-dimensional configuration, and $U_i^e$, that in each element, are
given as

Fig. 8.9 Four elements sharing a node

Fig. 8.10 Two-dimensional location index of nodes

\[
\begin{aligned}
U_{i-1,j-1} &= U_4^c \\
U_{i,j-1} &= U_4^a = U_1^c \\
U_{i+1,j-1} &= U_1^a \\
U_{i-1,j} &= U_3^c = U_4^d \\
U_{i,j} &= U_3^a = U_4^b = U_2^c = U_1^d \\
U_{i+1,j} &= U_2^a = U_1^b \\
U_{i-1,j+1} &= U_3^d \\
U_{i,j+1} &= U_3^b = U_2^d \\
U_{i+1,j+1} &= U_2^b
\end{aligned}
\tag{8.2.4}
\]
Similar relations hold for the displacements in the y-direction Vi, j and Vie .
Since the global stiffness matrix is represented by the sum of all the element stiff-
ness matrices [k e ], the matrix–vector product [K ]{U } in Eq. (8.2.1) can be calculated
by evaluating [k e ]{U e } element by element and summing the contributions from each
element. Here, we write the matrix–vector product [K ]{U } as
⎛ . ⎞ ⎛ . ⎞
. ..
⎜ . ⎟ ⎜ U ⎟
⎜ Ui, j ⎟ ⎜ gi, j ⎟
[K ]⎜ ⎟ ⎜
⎜ Vi, j ⎟ = ⎜ g V ⎟
⎟ (8.2.5)
⎝ ⎠ ⎝ i, j ⎠
.. ..
. .

The element-wise matrix–vector product $[k^e]\{U^e\}$ can also be expressed as

\[
[k^e]\{U^e\} =
\begin{pmatrix}
{}^1g_u^e \\ {}^1g_v^e \\ \vdots \\ {}^4g_u^e \\ {}^4g_v^e
\end{pmatrix}
\tag{8.2.6}
\]

Then, from Eqs. (8.2.2) and (8.2.3), we have

\[
\begin{aligned}
g_{i,j}^U &= {}^3g_u^a + {}^4g_u^b + {}^2g_u^c + {}^1g_u^d \\
g_{i,j}^V &= {}^3g_v^a + {}^4g_v^b + {}^2g_v^c + {}^1g_v^d
\end{aligned}
\tag{8.2.7}
\]

Using Eq. (8.2.6) to calculate each term on the right-hand side of $g_{i,j}^U$, and then
rearranging based on the nodal relations of Eq. (8.2.4), the following equations are
obtained.
\[
\begin{aligned}
{}^3g_u^a &= k_{51}^a U_1^a + k_{52}^a V_1^a + k_{53}^a U_2^a + k_{54}^a V_2^a + k_{55}^a U_3^a + k_{56}^a V_3^a + k_{57}^a U_4^a + k_{58}^a V_4^a \\
&= k_{51}^a U_{i+1,j-1} + k_{52}^a V_{i+1,j-1} + k_{53}^a U_{i+1,j} + k_{54}^a V_{i+1,j} \\
&\quad + k_{55}^a U_{i,j} + k_{56}^a V_{i,j} + k_{57}^a U_{i,j-1} + k_{58}^a V_{i,j-1}
\end{aligned}
\tag{8.2.8}
\]

\[
\begin{aligned}
{}^4g_u^b &= k_{71}^b U_1^b + k_{72}^b V_1^b + k_{73}^b U_2^b + k_{74}^b V_2^b + k_{75}^b U_3^b + k_{76}^b V_3^b + k_{77}^b U_4^b + k_{78}^b V_4^b \\
&= k_{71}^b U_{i+1,j} + k_{72}^b V_{i+1,j} + k_{73}^b U_{i+1,j+1} + k_{74}^b V_{i+1,j+1} + k_{75}^b U_{i,j+1} \\
&\quad + k_{76}^b V_{i,j+1} + k_{77}^b U_{i,j} + k_{78}^b V_{i,j}
\end{aligned}
\tag{8.2.9}
\]

\[
\begin{aligned}
{}^2g_u^c &= k_{31}^c U_1^c + k_{32}^c V_1^c + k_{33}^c U_2^c + k_{34}^c V_2^c + k_{35}^c U_3^c + k_{36}^c V_3^c + k_{37}^c U_4^c + k_{38}^c V_4^c \\
&= k_{31}^c U_{i,j-1} + k_{32}^c V_{i,j-1} + k_{33}^c U_{i,j} + k_{34}^c V_{i,j} + k_{35}^c U_{i-1,j} + k_{36}^c V_{i-1,j} \\
&\quad + k_{37}^c U_{i-1,j-1} + k_{38}^c V_{i-1,j-1}
\end{aligned}
\tag{8.2.10}
\]

\[
\begin{aligned}
{}^1g_u^d &= k_{11}^d U_1^d + k_{12}^d V_1^d + k_{13}^d U_2^d + k_{14}^d V_2^d + k_{15}^d U_3^d + k_{16}^d V_3^d + k_{17}^d U_4^d + k_{18}^d V_4^d \\
&= k_{11}^d U_{i,j} + k_{12}^d V_{i,j} + k_{13}^d U_{i,j+1} + k_{14}^d V_{i,j+1} + k_{15}^d U_{i-1,j+1} \\
&\quad + k_{16}^d V_{i-1,j+1} + k_{17}^d U_{i-1,j} + k_{18}^d V_{i-1,j}
\end{aligned}
\tag{8.2.11}
\]

Summing up Eqs. (8.2.8) to (8.2.11) and rearranging by $U_{i,j}$ and $V_{i,j}$, $g_{i,j}^U$ can be
expressed as

\[
\begin{aligned}
g_{i,j}^U &= k_{37}^c U_{i-1,j-1} + \left(k_{35}^c + k_{17}^d\right) U_{i-1,j} + k_{15}^d U_{i-1,j+1} + \left(k_{57}^a + k_{31}^c\right) U_{i,j-1} \\
&\quad + \left(k_{55}^a + k_{77}^b + k_{33}^c + k_{11}^d\right) U_{i,j} + \left(k_{75}^b + k_{13}^d\right) U_{i,j+1} \\
&\quad + k_{51}^a U_{i+1,j-1} + \left(k_{53}^a + k_{71}^b\right) U_{i+1,j} + k_{73}^b U_{i+1,j+1} \\
&\quad + k_{38}^c V_{i-1,j-1} + \left(k_{36}^c + k_{18}^d\right) V_{i-1,j} + k_{16}^d V_{i-1,j+1} + \left(k_{58}^a + k_{32}^c\right) V_{i,j-1} \\
&\quad + \left(k_{56}^a + k_{78}^b + k_{34}^c + k_{12}^d\right) V_{i,j} + \left(k_{76}^b + k_{14}^d\right) V_{i,j+1} \\
&\quad + k_{52}^a V_{i+1,j-1} + \left(k_{54}^a + k_{72}^b\right) V_{i+1,j} + k_{74}^b V_{i+1,j+1}
\end{aligned}
\tag{8.2.12}
\]

Equation (8.2.12) is further rearranged to

\[
g_{i,j}^U =
\left[ W_U^U \right] \circledast
\begin{bmatrix}
U_{i-1,j-1} & U_{i-1,j} & U_{i-1,j+1} \\
U_{i,j-1} & U_{i,j} & U_{i,j+1} \\
U_{i+1,j-1} & U_{i+1,j} & U_{i+1,j+1}
\end{bmatrix}
+
\left[ W_V^U \right] \circledast
\begin{bmatrix}
V_{i-1,j-1} & V_{i-1,j} & V_{i-1,j+1} \\
V_{i,j-1} & V_{i,j} & V_{i,j+1} \\
V_{i+1,j-1} & V_{i+1,j} & V_{i+1,j+1}
\end{bmatrix}
\tag{8.2.13}
\]

where $[W_U^U]$ and $[W_V^U]$ are, respectively, given as follows:

\[
\left[ W_U^U \right] =
\begin{bmatrix}
k_{37} & k_{35} + k_{17} & k_{15} \\
k_{57} + k_{31} & k_{55} + k_{77} + k_{33} + k_{11} & k_{75} + k_{13} \\
k_{51} & k_{53} + k_{71} & k_{73}
\end{bmatrix}
\tag{8.2.14}
\]

\[
\left[ W_V^U \right] =
\begin{bmatrix}
k_{38} & k_{36} + k_{18} & k_{16} \\
k_{58} + k_{32} & k_{56} + k_{78} + k_{34} + k_{12} & k_{76} + k_{14} \\
k_{52} & k_{54} + k_{72} & k_{74}
\end{bmatrix}
\tag{8.2.15}
\]

and $\circledast$ denotes a convolution operation, which is defined as

\[
\begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}
\circledast
\begin{bmatrix}
b_{11} & b_{12} & b_{13} \\
b_{21} & b_{22} & b_{23} \\
b_{31} & b_{32} & b_{33}
\end{bmatrix}
=
\sum_{i=1}^{3} \sum_{j=1}^{3} a_{ij} b_{ij}
\tag{8.2.16}
\]

Note that the element stiffness matrix of each element is the same for an equally
divided mesh, so the superscripts of the element numbers are omitted in Eqs. (8.2.14)
and (8.2.15).
Similarly, $g_{i,j}^V$ is expressed by the following equation.

\[
g_{i,j}^V =
\left[ W_U^V \right] \circledast
\begin{bmatrix}
U_{i-1,j-1} & U_{i-1,j} & U_{i-1,j+1} \\
U_{i,j-1} & U_{i,j} & U_{i,j+1} \\
U_{i+1,j-1} & U_{i+1,j} & U_{i+1,j+1}
\end{bmatrix}
+
\left[ W_V^V \right] \circledast
\begin{bmatrix}
V_{i-1,j-1} & V_{i-1,j} & V_{i-1,j+1} \\
V_{i,j-1} & V_{i,j} & V_{i,j+1} \\
V_{i+1,j-1} & V_{i+1,j} & V_{i+1,j+1}
\end{bmatrix}
\tag{8.2.17}
\]

where $[W_U^V]$ and $[W_V^V]$ are, respectively, given as

\[
\left[ W_U^V \right] =
\begin{bmatrix}
k_{47} & k_{45} + k_{27} & k_{25} \\
k_{67} + k_{41} & k_{65} + k_{87} + k_{43} + k_{21} & k_{85} + k_{23} \\
k_{61} & k_{63} + k_{81} & k_{83}
\end{bmatrix}
\tag{8.2.18}
\]

\[
\left[ W_V^V \right] =
\begin{bmatrix}
k_{48} & k_{46} + k_{28} & k_{26} \\
k_{68} + k_{42} & k_{66} + k_{88} + k_{44} + k_{22} & k_{86} + k_{24} \\
k_{62} & k_{64} + k_{82} & k_{84}
\end{bmatrix}
\tag{8.2.19}
\]

As described above, the matrix–vector product of the global stiffness matrix and
the nodal displacement vector $[K]\{U\}$, which appears in the finite element analysis,
can be calculated by a convolution operation using some 3 × 3 matrices such as
$[W_U^U]$ as a filter. In the original article [18], this convolution operation is called
FEA-Convolution.
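
The following NumPy sketch illustrates the FEA-Convolution of Eqs. (8.2.13)–(8.2.16);
it is not the implementation of [18], and boundary nodes are simply left untouched for
brevity. The filters are transcribed from Eqs. (8.2.14) and (8.2.15) with 0-based indices
($k_{pq}$ in the text corresponds to k[p-1, q-1]).

import numpy as np

def filters_U(k):
    # Filters [W_U^U] and [W_V^U] built from one element stiffness matrix k (8 x 8)
    W_UU = np.array([
        [k[2, 6],           k[2, 4] + k[0, 6],                     k[0, 4]],
        [k[4, 6] + k[2, 0], k[4, 4] + k[6, 6] + k[2, 2] + k[0, 0], k[6, 4] + k[0, 2]],
        [k[4, 0],           k[4, 2] + k[6, 0],                     k[6, 2]]])
    W_VU = np.array([
        [k[2, 7],           k[2, 5] + k[0, 7],                     k[0, 5]],
        [k[4, 7] + k[2, 1], k[4, 5] + k[6, 7] + k[2, 3] + k[0, 1], k[6, 5] + k[0, 3]],
        [k[4, 1],           k[4, 3] + k[6, 1],                     k[6, 3]]])
    return W_UU, W_VU

def fea_conv_U(U, V, W_UU, W_VU):
    # g^U at every interior node, Eq. (8.2.13), using the 3 x 3 sum of
    # products of Eq. (8.2.16) on the neighborhood of node (i, j)
    gU = np.zeros_like(U)
    for i in range(1, U.shape[0] - 1):
        for j in range(1, U.shape[1] - 1):
            gU[i, j] = (np.sum(W_UU * U[i-1:i+2, j-1:j+2])
                        + np.sum(W_VU * V[i-1:i+2, j-1:j+2]))
    return gU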

8.2.2 FEA-Net Based on FEA-Convolution

As shown in Sect. 8.2.1, the stress analysis by the finite element method is equivalent
to solving the following simultaneous linear equations.

[K ]{U } = { f } (8.2.20)

To solve Eq. (8.2.20), an iterative method, a method to minimize the residual
iteratively, can be employed [3], where candidate solutions are improved iteratively
as $\{U^{(0)}\}, \{U^{(1)}\}, \{U^{(2)}\}, \cdots$. The residual $\{r^{(n)}\}$ for $\{U^{(n)}\}$ is defined as follows:

\[
\{r^{(n)}\} = \{f\} - [K]\{U^{(n)}\}
\tag{8.2.21}
\]

Among the iterative methods, the Jacobi method is considered one of the most
basic ones [8], where the coefficient matrix $[K]$ is divided into the diagonal-only
matrix $[K^D]$ and the off-diagonal-only matrix $[K^{ND}]$ as

\[
[K] = [K^D] + [K^{ND}]
\tag{8.2.22}
\]

Substituting Eq. (8.2.22) into Eq. (8.2.20), we obtain

\[
\left( [K^D] + [K^{ND}] \right) \{U\} = \{f\}
\tag{8.2.23}
\]

Based on Eq. (8.2.23), a recurrence formula is achieved as follows:

\[
\{U^{(n+1)}\} = [K^D]^{-1} \left( \{f\} - [K^{ND}] \{U^{(n)}\} \right)
\tag{8.2.24}
\]

Based on Eq. (8.2.21), Eq. (8.2.24) is rearranged as

\[
\{U^{(n+1)}\} = \{U^{(n)}\} + [K^D]^{-1} \{r^{(n)}\}
\tag{8.2.25}
\]

As Eqs. (8.2.21) and (8.2.25) are iterated, $\{U^{(n)}\}$ will converge to the correct
solution. Note that, since $[K^D]$ is a diagonal matrix, multiplying by $[K^D]^{-1}$ in Eq.
(8.2.25) is a simple division process.
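
A minimal NumPy sketch of the Jacobi iteration of Eqs. (8.2.21)–(8.2.25) for a generic
system is given below; in FEA-Net, the matrix–vector product in the residual is replaced
by the FEA-Convolution of Sect. 8.2.1.

import numpy as np

def jacobi(K, f, n_iter=1000, tol=1e-10):
    # Solve K U = f by the Jacobi method, Eqs. (8.2.21)-(8.2.25)
    d = np.diag(K)                        # diagonal part [K^D]
    U = np.zeros_like(f, dtype=float)
    for _ in range(n_iter):
        r = f - K @ U                     # residual, Eq. (8.2.21)
        U = U + r / d                     # update, Eq. (8.2.25); [K^D]^-1 is a division
        if np.linalg.norm(r) <= tol * np.linalg.norm(f):
            break
    return U

Convergence of the iteration depends, of course, on the properties of $[K]$ (e.g., diagonal
dominance).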
In the iterative solution method (e.g., the Jacobi method), the calculation of the
matrix–vector product $[K]\{U^{(n)}\}$ in Eq. (8.2.21) is the most computationally intensive
process. This can be performed using the FEA-Convolution (FEA-Conv), which
has been studied in Sect. 8.2.1. In other words, the above iterative solution method
is an operation that repeats convolution operations similar to convolutional neural
networks (CNNs) with many convolution layers. For this reason, this operation is
called FEA-Net in the original article [18].
The structure of FEA-Net is shown in Fig. 8.11. It can be regarded as a multilayer
convolutional neural network with the right-hand side vector $\{f\}$ as the input data
and the nodal displacement vector $\{U^{(N)}\}$ as the output data. The actual input and
output data are both two-dimensional images, and, specifically, $\{U^{(N)}\}$ (and also
$\{U^{(n)}\}$ during the iteration) consists of two images: one is a two-dimensional image
of the x-direction displacements and the other that of the y-direction displacements.
Since FEA-Net basically performs the same process as the iterative solution
method, it requires the same number of layers as the number of iterations in the
iterative solution method. For this reason, the number of layers is much larger than
that of an ordinary CNN.

Fig. 8.11 FEA-Net

8.2.3 Numerical Example

A numerical example of using FEA-net in a two-dimensional stress analysis is shown


in what follows.
Data Preparation Phase:
Figure 8.12 shows the training and testing data. Figure 8.12a gives the training data,
where the top row is the load image as the input data, and the middle and bottom
rows the displacement images as the teacher data (output data). The x-direction load
is applied to the white area in the load image. The total number of training data is 4.
In addition to the four training data, two data with the same resolution as the training
data (Fig. 8.12b) and two data with higher resolution (Fig. 8.12c) are prepared for
verification.

Fig. 8.12 Training and test data for FEA-Net. Reprinted from [18] with permission from Elsevier

Training Phase:
Learning in FEA-Net is not performed on such a whole network as that in Fig. 8.11,
but defined to be the optimization of the filters in the convolution layers that are
the building blocks of the FEA-net. For this reason, a network with one convolution
layer, which can be called FEA-Conv network, is trained using the input data as the
displacement image and the teacher data as the load image. After the training of
the FEA-Conv network, the FEA-Net is constructed using the obtained filters. The
number of layers of the FEA-Net used in this example is 5000.
For comparison, a general CNN consisting of seven convolution layers is also
trained, where a load image is used as the input data and two displacement images
as teacher data.
Application Phase:
Figure 8.13 shows the estimated results for the validation data using the trained FEA-
Net and those using the trained CNN for comparison. The results from the ordinary
finite element analysis are also shown as the reference response. It can be seen that
FEA-Net estimates the displacement field more accurately than the ordinary CNN.

8.3 DiscretizationNet

In Chap. 7, a method to predict an unsteady flow field using deep learning has been
studied. In this section, another example applying deep learning to a flow field is
taken, which is DiscretizationNet [13], an application of a new deep learning model
called the generative model (Sect. 1.3.7) to the fluid analysis.

8.3.1 DiscretizationNet Based on Conditional Variational Autoencoder

The continuity equation and the Navier–Stokes equation (dimensionless) for the
steady-state flow field are known to be, respectively, written as follows:

Fig. 8.13 Prediction of displacement field using FEA-Net. Reprinted from [18] with permission
from Elsevier

\[
\nabla \cdot v = 0
\tag{8.3.1}
\]

\[
(v \cdot \nabla)v + \nabla p - \frac{1}{Re}\nabla^2 v = 0
\tag{8.3.2}
\]

where v and p are the non-dimensionalized velocity vector and pressure, respectively,
and Re is the Reynolds number.
The deep learning model used in this section is named DiscretizationNet [13],
the conceptual diagram of which is shown in Fig. 8.14. This is an application of the
conditional variational autoencoder [10, 15], which is one of the generative models.
Training processes in DiscretizationNet are summarized as follows:
(T1) The information on the shape of analysis domain is denoted by h, which is
expressed by the level set method and takes the values 0 for the inside of the
shape and 1 for the outside. The Geometry autoencoder shown in Fig. 8.15
outputs ĥ when h is input, and the compressed information ηh of h is obtained
when trained to reproduce h with setting ĥ = h.
(T2) The information on the boundary condition is denoted by b, and the boundary
autoencoder shown in Fig. 8.16 outputs b̂ when b is input, and the compressed
information ηb of b is obtained when trained to reproduce b with setting
b̂ = b. The boundary autoencoder is unnecessary for boundary conditions
not changing with time and space.
If a boundary condition is set at each side of a quadrilateral region as shown
in Fig. 8.17, and the boundary condition does not change with space and time,

Fig. 8.14 Schematic diagram of DiscretizationNet for Navier–Stokes solution. Reprinted from [13]
with permission from Elsevier

Fig. 8.15 Geometry autoencoder

Fig. 8.16 Boundary autoencoder



Fig. 8.17 Sample boundary condition (Neumann B. C. and Dirichlet B. C. with the boundary
values 0.0, 0.3, 1.2 and 3.0 prescribed on the four sides)

then ηb can be simply defined to be ηb = {1, 1, 2, 1, 0.3, 1.2, 0.0, 3.0, 40} for
example, where the last value of 40 is the Reynolds number.
(T3) The velocity vectors and pressures u, v, w and p are initialized with random
numbers.
(T4) u, v, w and p are input to the CNN encoder and the compressed information
η is obtained.
(T5) ηh , ηb and η are input to the CNN decoder and û, v̂, ŵ and p̂ are obtained as
output.
(T6) Using û, v̂, ŵ, p̂,h and b, the residual L train in Eqs. (8.3.1) and (8.3.2) is calcu-
lated. If û, v̂, ŵ and p̂ are correct, the residual will be zero. The residual L train
is expressed by the following equation.

L train = ∥RC ∥ + ∥Ru ∥ + ∥Rv ∥ + ∥Rw ∥ (8.3.3)

where RC is the residual of the equation of continuity (Eq. (8.3.1)) and


Ru , Rv , and Rw are, respectively, the residuals of the Navier–Stokes equations
(Eq. 8.3.2).
(T7) If L train is small enough, go to exit. If not, go to (T8).
(T8) The parameters of the encoder and the decoder are updated to minimize the
L train by the error back propagation algorithm.
(T9) Go to (T4) with û → u, v̂ → v, ŵ → w and p̂ → p.

Fig. 8.18 Schematic diagram of DiscretizationNet for new geometry and boundary conditions.
Reprinted from [13] with permission from Elsevier

In the processes above, û = u, v̂ = v, ŵ = w, and p̂ = p hold at convergence,


which is the behavior of an autoencoder, where the input and output are the same.
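
As a rough illustration of the residual used in step (T6), the sketch below evaluates the
continuity and momentum residuals of Eqs. (8.3.1) and (8.3.2) for a two-dimensional field
on a uniform grid with central differences (NumPy); the actual DiscretizationNet evaluates
the residuals through its finite-volume discretization, so this is only a simplified stand-in.

import numpy as np

def l_train_2d(u, v, p, Re, dx, dy):
    # Approximate L_train of Eq. (8.3.3) for a 2D (u, v, p) field
    du_dy, du_dx = np.gradient(u, dy, dx)   # derivatives along y (axis 0) and x (axis 1)
    dv_dy, dv_dx = np.gradient(v, dy, dx)
    dp_dy, dp_dx = np.gradient(p, dy, dx)
    lap = lambda f: (np.gradient(np.gradient(f, dx, axis=1), dx, axis=1)
                     + np.gradient(np.gradient(f, dy, axis=0), dy, axis=0))
    R_c = du_dx + dv_dy                                # continuity, Eq. (8.3.1)
    R_u = u*du_dx + v*du_dy + dp_dx - lap(u)/Re        # x-momentum, Eq. (8.3.2)
    R_v = u*dv_dx + v*dv_dy + dp_dy - lap(v)/Re        # y-momentum
    return np.linalg.norm(R_c) + np.linalg.norm(R_u) + np.linalg.norm(R_v)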
After the training is completed, we apply the trained DiscretizationNet to new
boundary conditions and new geometries of analysis domains as follows (Fig. 8.18):
(A1) The geometry of the new analysis domain $h^{\mathrm{NEW}}$ is input to the geometry
encoder to obtain the compressed information $\eta_h^{\mathrm{NEW}}$. Similarly, the new
boundary condition $b^{\mathrm{NEW}}$ is input to the boundary encoder to obtain the
compressed information $\eta_b^{\mathrm{NEW}}$.
(A2) The compressed information ηNEW of the solution in the new analysis is
initialized with a random number.
(A3) ηhNEW , ηbNEW and ηNEW are input to the CNN decoder of the trained Discretiza-
tionNet, and û NEW , v̂ NEW , ŵ NEW and p̂ NEW are obtained as the output.
(A4) û NEW , v̂ NEW , ŵ NEW and p̂ NEW are input to the CNN encoder of the trained
DiscretizationNet, and η̂NEW is obtained as the output.
(A5) Using $\hat\eta^{\mathrm{NEW}}$ and $\eta^{\mathrm{NEW}}$, the residual $L_{\mathrm{Appli}}$ is calculated, which is defined by

\[
L_{\mathrm{Appli}} = \left\| \hat\eta^{\mathrm{NEW}} - \eta^{\mathrm{NEW}} \right\|
\tag{8.3.4}
\]

(A6) If L Appli is small enough, finish with the solutions û NEW , v̂ NEW , ŵ NEW
and p̂ NEW obtained as the output of the CNN decoder of the trained
DiscretizationNet for ηhNEW , ηbNEW and η̂NEW as input. If not, go to (A7).
(A7) Return to (A3) with η̂NEW → ηNEW .
Note that ηhNEW , ηbNEW and the parameters of CNN encoder and decoder of Discretiza-
tionNet are fixed in the process from (A3) to (A7). Note also that the iterations from
(A3) to (A7) will converge within 10 iterations for the well-trained DiscretizationNet
[13].

8.3.2 Numerical Example

DiscretizationNet is numerically tested here. The analysis target is shown in Fig. 8.19,
where the flow field from left to right is analyzed, and the circular cylinder serves
as an obstacle. The boundary conditions are the flow velocity at the inlet and the
pressure at the outlet. A similar flow field has been treated in Chap. 7, but there the
analysis focused on the unsteady turbulent flow phenomena due to the high Reynolds
number, while the present analysis focuses on the steady analysis of the laminar
flow phenomena at a low Reynolds number. The number of elements
in the figure is that of the finite volume method [2, 6] employed for comparison.
Data Preparation Phase:
Analyses are performed for comparison for five different velocity inlets (0.2, 0.4,
0.6, 0.8, and 1.0) and three different Reynolds numbers (10, 20, and 40).
Training Phase:
A single DiscretizationNet is constructed by the training (T1) to (T9) above for 15
combinations of velocity inlets (0.2, 0.4, 0.6, 0.8, and 1.0) and Reynolds numbers
(10, 20, and 40).
The DiscretizationNet constructed here consists of three convolution layers for
both CNN encoder and CNN decoder, and 64 filters are used in each convolution
layer. The number of training epochs is set to 30,000, and a GPU (NVIDIA Tesla
V100 SXM2) is used for training.
The trained DiscretizationNet is considered to give good results comparable
to those of the comparison analysis (using ANSYS Fluent R19.3) for all 15
conditions.
Application Phase:
The trained DiscretizationNet is applied to new boundary conditions not included
in the training data, where the velocity inlet is set to 0.5 and the Reynolds number

Fig. 8.19 Analysis domain and boundary conditions (velocity inlet on the left, pressure outlet
on the right, circular cylinder in the channel; 320 × 128 elements)



is selected from 10, 20, and 40. For each of these new boundary conditions, the
inference process from (A1) to (A7) above is performed to obtain the flow field.
The results are shown in Fig. 8.20, where the left column is the results of
the velocity field estimation by DiscretizationNet, the middle column those calcu-
lated using ANSYS Fluent, and the right column the error in the estimation by
DiscretizationNet (the difference from the results by ANSYS). It is concluded that
the estimation by DiscretizationNet is very accurate.

Fig. 8.20 Velocity magnitude at different Reynolds numbers. (A-1) DiscretizationNet, (A-2)
ANSYS Fluent, (A-3) Difference between (A-1) and (A-2). Reprinted from [13] with permission
from Elsevier

8.4 Zooming Method for Finite Element Analysis

It is known that, in composite materials such as CFRP (carbon fiber reinforced
plastic composites), a damage analysis taking into account the fibers and resins in
the material results in extremely large-scale computation.
In order to solve this issue, the zooming method [5, 11] has been developed,
where the entire analysis domain is analyzed with a coarse mesh, while the part
of interest is analyzed with a fine mesh using the results of the coarse analysis as
boundary conditions. The total computational load is expected to be reduced by using
the above method. In this section, a new method for improving the accuracy of the
zooming method by deep learning is discussed.

8.4.1 Zooming Method for FEA Using Neural Network

The schematic diagram of a zooming method is shown in Fig. 8.21, where the original
analysis geometry has fillets at the corners where stresses would be concentrated,
and a mesh that accurately reproduces the fillet part with small elements is called a global
fine model. Since the computational load of this global fine model is very high, the
original analysis geometry is divided into two parts to reduce the computational load:
a global coarse model and a local fine model.
In the global coarse model, the analysis domain is divided by large elements with
no regard to the fillet area, and a simplified material model is used. On the other hand,
in the local fine model, only the fillet and its vicinity are divided by fine elements
and a detailed material model is used. First, the global coarse model is analyzed, and
then using its results (displacements) as a boundary condition, the local fine model
is analyzed to obtain the solution.
There remains a problem to be considered when using the results (displacements)
of the global coarse model analysis as the boundary conditions for the local fine
model analysis as discussed below. Figure 8.22 shows a region around the local fine
model. The displacements at each node of the global coarse model are obtained from
the analysis of the global coarse model. In the zooming method, the displacements
are used for the boundary conditions (displacements) at the nodes on the periphery
of the local fine model.
If the nodes of the local fine model are at the periphery or inside of the global
coarse model, the displacements of the nodes of the local fine model can be obtained
by interpolating using the basis functions of the global coarse model. However, as
shown in Fig. 8.22, some nodes of the local fine model exist outside the global coarse
model, where the displacements at these nodes cannot be obtained by interpolation
using the basis functions of the global coarse model.
For each node of the local fine model outside the global coarse model, its nearest
element of the global coarse model is searched and the displacements of the node are
obtained by extrapolation using the displacements of nodes of the element with the

Fig. 8.21 Global and local finite element models

Fig. 8.22 Nodes in global and local models (nodes on the exterior surface of the local model
located in or out of the global model, and nodes of the global model)



basis functions of the element. But, this method is known to degrade the accuracy of
extrapolated displacements used in the local fine model.
To solve this problem, a method for estimating nodal displacements of local fine
model nodes outside the global coarse model using a feedforward neural network is
proposed [17].
The zooming method using feedforward neural networks above consists of the
following three phases.
Data Preparation Phase:
Finite element analysis using the global coarse model is performed for a given
analysis condition. From each analysis result on the i-th node of the global
coarse model that is within the region of the local fine model, a pair of the
node coordinates $(X_i^G, Y_i^G, Z_i^G)$ and the nodal displacements $(U_i^G, V_i^G, W_i^G)$,
i.e. $\{(X_i^G, Y_i^G, Z_i^G), (U_i^G, V_i^G, W_i^G)\}$, is obtained. In this manner, a lot of data
pairs are collected.
Training Phase:
A feedforward neural network is trained using the data pairs collected in the Data
Preparation Phase, where input and teacher data for the neural network are set as
follows:
Input data: $(X_i^G, Y_i^G, Z_i^G)$
Teacher data: $(U_i^G, V_i^G, W_i^G)$

Application Phase:
The coordinates $(X_i^L, Y_i^L, Z_i^L)$ of a node in the local fine model are input to the
trained feedforward neural network, and the displacements $(U_i^L, V_i^L, W_i^L)$ at the
node are obtained as the output, and then the local fine model is analyzed using
these as the boundary conditions.
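
The three phases can be sketched compactly in Python, for example with scikit-learn's
MLPRegressor (the library choice and the placeholder arrays below are for illustration
only; the actual network structure used in [17] is described in Sect. 8.4.2):

import numpy as np
from sklearn.neural_network import MLPRegressor

# Data Preparation Phase: coordinates and displacements of the global coarse
# model nodes lying in the region of the local fine model (placeholders here)
X_global = np.random.rand(1729, 3)      # (X_i^G, Y_i^G, Z_i^G)
U_global = np.random.rand(1729, 3)      # (U_i^G, V_i^G, W_i^G)

# Training Phase: feedforward network mapping coordinates to displacements
net = MLPRegressor(hidden_layer_sizes=(50, 50, 50), activation='relu',
                   max_iter=5000)
net.fit(X_global, U_global)

# Application Phase: displacements at the outer-surface nodes of the local
# fine model, to be applied as its displacement boundary conditions
X_local_surface = np.random.rand(200, 3)    # (X_i^L, Y_i^L, Z_i^L)
U_boundary = net.predict(X_local_surface)   # (U_i^L, V_i^L, W_i^L)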

8.4.2 Numerical Example

A numerical example is shown here to verify the effectiveness of the method


described above. Figure 8.23 shows the analysis domain, where fillets of ten different
sizes (R = 0.1, 0.2, · · · , 1.0) are tested. The material model is an isotropic elastic
body, and the hexahedral elements of the first order are used. The size of the element
in the global coarse model is 0.5 mm, and that of the local fine model is 0.2 mm.
Data Preparation Phase:
Fig. 8.23 Analysis domain. Reprinted from [17] with permission from Elsevier

By the analysis using the global coarse model, nodal displacements are calculated
for 1729 nodes in the region encompassing the local fine model, and 1729 data
pairs $\{(X_i^G, Y_i^G, Z_i^G), (U_i^G, V_i^G, W_i^G)\}$ are obtained. Of these, 70% are used as the
training data and the rest as the data for verification of the generalization capability.
Training Phase:
The training data collected in the Data Preparation Phase are used to train the feedforward
neural network. The structure of the feedforward neural network employed
is given as follows:

Input layer: 3 units for $(X_i^G, Y_i^G, Z_i^G)$.
Hidden layers: Several structures are tested: 1, 2, 3, 4 or 5 layers, and 10, 50 and
100 units per hidden layer.
Output layer: 3 units for $(U_i^G, V_i^G, W_i^G)$.
Figure 8.24 shows the training results. The horizontal axis is the number of hidden
layers, and the vertical axis the error for the validation data. From the
figure, it is seen that the error is the lowest when using three hidden layers with the
ReLU function.
Application Phase:
The trained feedforward neural network with three hidden layers and ReLU as
the activation function is used to determine the boundary condition (displacement)
of the local fine model. The performance of this method is evaluated by the value
of the von Mises stress obtained from the analysis of the local fine model. The von
Mises stress is an index often used in the strength analysis and given as follows [1,
14]:

Fig. 8.24 Effects of network hyperparameters. Reprinted from [17] with permission from Elsevier

\[
\sigma_{\mathrm{Mises}} = \sqrt{ \frac{1}{2}\left\{ \left(\sigma_{xx} - \sigma_{yy}\right)^2 + \left(\sigma_{yy} - \sigma_{zz}\right)^2 + \left(\sigma_{zz} - \sigma_{xx}\right)^2 + 6\left(\tau_{xy}^2 + \tau_{yz}^2 + \tau_{zx}^2\right) \right\} }
\tag{8.4.1}
\]
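
Eq. (8.4.1) translates directly into a short Python function (NumPy), which may be
applied to scalars or arrays of stress components:

import numpy as np

def von_mises(s_xx, s_yy, s_zz, t_xy, t_yz, t_zx):
    # Von Mises stress, Eq. (8.4.1)
    return np.sqrt(0.5 * ((s_xx - s_yy)**2 + (s_yy - s_zz)**2 + (s_zz - s_xx)**2
                          + 6.0 * (t_xy**2 + t_yz**2 + t_zx**2)))

# Example: a uniaxial stress state (s_xx = 100, all other components zero)
# gives von_mises(100, 0, 0, 0, 0, 0) = 100.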

First, the von Mises stress values are calculated by the zooming method using the
feedforward neural network with the procedure as follows:
(A1) Input the coordinates $(X_i^L, Y_i^L, Z_i^L)$ of the periphery node of the local fine
model to the trained neural network, and let $(U_i^L, V_i^L, W_i^L)$ obtained as the
output of the neural network be the boundary condition (fixed displacement)
of the local fine model.
(A2) Perform the analysis of the local fine model and calculate the von Mises stress
at each node of the local fine model.
(A3) Find the maximum value among the calculated von Mises stresses and set it
as $\sigma_{\mathrm{Mises}}^A$.
For comparison, the maximum value of the von Mises stresses is calculated also
by another procedure as follows:
(B1) For the coordinates $(X_i^L, Y_i^L, Z_i^L)$ of the outer periphery node of the local fine
model, identify the nearest element of the global coarse model, and extrapolate
$(U_i^L, V_i^L, W_i^L)$ from the displacements at the nodes of the element using the
shape (basis) function of the global coarse model, and set it as the boundary
condition (fixed displacement) of the local fine model.
(B2) Perform the analysis of the local fine model and calculate the von Mises stress
at each node of the local fine model.
(B3) Find the maximum value among the calculated von Mises stresses and set it
as $\sigma_{\mathrm{Mises}}^B$.

In addition, the maximum value of the von Mises stresses is calculated by another
procedure as follows:
(C1) Create a global fine model dividing the entire region including the fillet into
elements with the same fineness as the local fine model.
(C2) Perform the analysis of the global fine model above and calculate the von
Mises stress at each node in the same region as the local fine model.
(C3) Find the maximum value among the calculated von Mises stresses and set it as
$\sigma_{\mathrm{Mises}}^C$. This value is considered to be close to the correct one.
The performance of the present method is evaluated from the comparison among
the three von Mises stress values $\sigma_{\mathrm{Mises}}^A$, $\sigma_{\mathrm{Mises}}^B$ and $\sigma_{\mathrm{Mises}}^C$.
The results for various radii of curvature of the fillet are shown in Fig. 8.25. The
horizontal axis is the radius of curvature of the fillet, and the vertical axis is the
maximum von Mises stress values. The results clearly show that $\sigma_{\mathrm{Mises}}^C \approx \sigma_{\mathrm{Mises}}^A < \sigma_{\mathrm{Mises}}^B$,
indicating that the zooming method using the feedforward neural network is
more accurate than that based on the shape (basis) functions.


Using this method, a large-scale CFRP strength analysis with more than 50 million
degrees of freedom has been successfully performed [17]. In addition, an attempt
has also been made to estimate the stress at the fillet directly by using a feedforward
neural network [16].

Fig. 8.25 Performance of zooming method versus radius of fillet. Reprinted from [17] with
permission from Elsevier

8.5 Physics-Informed Neural Network

In this section, an application of the physics-informed neural network [12] (see


Sect. 2.4.3) to the inverse analysis and the surrogate model in solid mechanics is
discussed.

8.5.1 Application of Physics-Informed Neural Network to Solid Mechanics

Consider a two-dimensional stress analysis of a solid. The basic equations of two-dimensional
stress analysis are given with displacements $u(x,y)$ and $v(x,y)$, strains
$\varepsilon_x(x,y)$, $\varepsilon_y(x,y)$ and $\gamma_{xy}(x,y)$, stresses $\sigma_x(x,y)$, $\sigma_y(x,y)$ and $\tau_{xy}(x,y)$, body
forces $f_x(x,y)$ and $f_y(x,y)$, and external loads $\bar T_x(x,y)$ and $\bar T_y(x,y)$, as follows:
Equations for the balance of forces in the region Ω:

\[
\begin{cases}
\dfrac{\partial \sigma_x}{\partial x} + \dfrac{\partial \tau_{xy}}{\partial y} + f_x = 0 \\[4pt]
\dfrac{\partial \tau_{xy}}{\partial x} + \dfrac{\partial \sigma_y}{\partial y} + f_y = 0
\end{cases}
\quad \text{in } \Omega
\tag{8.5.1}
\]

Equations for the balance of forces on the load boundary $\Gamma_\sigma$:

\[
\begin{cases}
\sigma_x n_x + \tau_{xy} n_y = \bar T_x \\
\tau_{xy} n_x + \sigma_y n_y = \bar T_y
\end{cases}
\quad \text{on } \Gamma_\sigma
\tag{8.5.2}
\]

Equations for the displacements at the fixed boundary $\Gamma_u$:

\[
\begin{cases}
u = \bar u \\
v = \bar v
\end{cases}
\quad \text{on } \Gamma_u
\tag{8.5.3}
\]

Strain–displacement equation:

\[
\begin{pmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{pmatrix}
=
\begin{bmatrix}
\frac{\partial}{\partial x} & 0 \\
0 & \frac{\partial}{\partial y} \\
\frac{\partial}{\partial y} & \frac{\partial}{\partial x}
\end{bmatrix}
\begin{pmatrix} u \\ v \end{pmatrix}
\tag{8.5.4}
\]

Stress–strain equation (constitutive equation of elastic body):

\[
\begin{pmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{pmatrix}
=
\begin{bmatrix}
\lambda + 2\mu & \lambda & 0 \\
\lambda & \lambda + 2\mu & 0 \\
0 & 0 & \mu
\end{bmatrix}
\begin{pmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{pmatrix}
\tag{8.5.5}
\]

where λ and μ are the Lame’s constants, which have the following relationship with
the Young’s modulus E and the Poisson’s ratio ν.

μ(3λ + 2μ) λ
E= , ν= (8.5.6)
λ+μ 2(λ + μ)

Now, consider the analysis domain and boundary conditions as shown in Fig. 8.26
[4]. The external load is given as follows:

\[
\bar T_x(x, y) = 0 \quad (\text{on } \Gamma_\sigma)
\tag{8.5.7}
\]

\[
\bar T_y(x, y) =
\begin{cases}
(\lambda + 2\mu)\, Q \sin(\pi x) & (\text{on } y = 1) \\
0 & (\text{elsewhere})
\end{cases}
\tag{8.5.8}
\]

And the body forces are

\[
\begin{aligned}
f_x(x, y) &= \lambda \left\{ 4\pi^2 \cos(2\pi x)\sin(\pi y) - \pi \cos(\pi x)\, Q y^3 \right\} \\
&\quad + \mu \left\{ 9\pi^2 \cos(2\pi x)\sin(\pi y) - \pi \cos(\pi x)\, Q y^3 \right\}
\end{aligned}
\tag{8.5.9}
\]

\[
\begin{aligned}
f_y(x, y) &= \lambda \left\{ -3 \sin(\pi x)\, Q y^2 + 2\pi^2 \sin(2\pi x)\cos(\pi y) \right\} \\
&\quad + \mu \left\{ -6 \sin(\pi x)\, Q y^2 + 2\pi^2 \sin(2\pi x)\cos(\pi y) + \frac{1}{4}\pi^2 \sin(\pi x)\, Q y^4 \right\}
\end{aligned}
\tag{8.5.10}
\]

The analytical solution for the above problem is known as

\[
u(x, y) = \cos(2\pi x)\sin(\pi y)
\tag{8.5.11}
\]

\[
v(x, y) = \frac{1}{4}\sin(\pi x)\, Q\, y^4
\tag{8.5.12}
\]

Fig. 8.26 Two-dimensional stress analysis
Figure 8.27 illustrates the solution for λ = 1.0, μ = 0.5 and Q = 4.0.
Prepare a large number of sampling points in the analysis domain, and calculate
the coordinates $x^*$ and $y^*$, displacements $u^*$ and $v^*$, and stresses $\sigma_x^*$, $\sigma_y^*$ and $\tau_{xy}^*$, at
each sampling point. For the data obtained in this way, a physics-informed neural
network is constructed as follows [4]:
Input data: Coordinates of a sampling point inside the region, $x^*$ and $y^*$.
Teacher data: Displacements $u^*$ and $v^*$ and stresses $\sigma_x^*$, $\sigma_y^*$ and $\tau_{xy}^*$ at the sampling
point.
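
A NumPy sketch of this data preparation step is shown below, assuming the unit-square
domain of Fig. 8.26: the displacements are sampled from Eqs. (8.5.11) and (8.5.12), and
the strains and stresses follow from Eqs. (8.5.4) and (8.5.5) with the derivatives of the
analytical solution written out by hand.

import numpy as np

lam, mu, Q = 1.0, 0.5, 4.0
n = 100
x, y = np.meshgrid(np.linspace(0.0, 1.0, n), np.linspace(0.0, 1.0, n))

u = np.cos(2*np.pi*x) * np.sin(np.pi*y)                  # Eq. (8.5.11)
v = 0.25 * Q * np.sin(np.pi*x) * y**4                    # Eq. (8.5.12)

eps_x  = -2*np.pi * np.sin(2*np.pi*x) * np.sin(np.pi*y)  # du/dx
eps_y  = Q * np.sin(np.pi*x) * y**3                      # dv/dy
gam_xy = (np.pi * np.cos(2*np.pi*x) * np.cos(np.pi*y)    # du/dy + dv/dx
          + 0.25 * Q * np.pi * np.cos(np.pi*x) * y**4)

sig_x  = (lam + 2*mu) * eps_x + lam * eps_y              # Eq. (8.5.5)
sig_y  = lam * eps_x + (lam + 2*mu) * eps_y
tau_xy = mu * gam_xy

inputs  = np.stack([x.ravel(), y.ravel()], axis=1)       # (x*, y*)
targets = np.stack([u.ravel(), v.ravel(), sig_x.ravel(),
                    sig_y.ravel(), tau_xy.ravel()], axis=1)   # teacher data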

Fig. 8.27 Solution for parameter values of λ = 1.0, μ = 0.5, Q = 4.0. Reprinted from [4] with
permission from Elsevier

With $L_1$, the error in a normal neural network, $L_2$, the error specific to the physics-informed
neural network, which is related to the equilibrium equations, and $L_3$, which
is related to the constitutive equation, the error function L is defined as

\[
L = L_1 + L_2 + L_3
\tag{8.5.13}
\]

where

\[
L_1 = \left| u^{NN} - u^* \right| + \left| v^{NN} - v^* \right| + \left| \sigma_x^{NN} - \sigma_x^* \right| + \left| \sigma_y^{NN} - \sigma_y^* \right| + \left| \tau_{xy}^{NN} - \tau_{xy}^* \right|
\tag{8.5.14}
\]

\[
L_2 = \left| \frac{\partial \sigma_x^{NN}}{\partial x} + \frac{\partial \tau_{xy}^{NN}}{\partial y} + f_x^* \right|
+ \left| \frac{\partial \tau_{xy}^{NN}}{\partial x} + \frac{\partial \sigma_y^{NN}}{\partial y} + f_y^* \right|
\tag{8.5.15}
\]

\[
\begin{aligned}
L_3 &= \left| (\lambda + 2\mu)\varepsilon_x^{NN} + \lambda \varepsilon_y^{NN} - \sigma_x^{NN} \right| \\
&\quad + \left| (\lambda + 2\mu)\varepsilon_y^{NN} + \lambda \varepsilon_x^{NN} - \sigma_y^{NN} \right| \\
&\quad + \left| \mu \gamma_{xy}^{NN} - \tau_{xy}^{NN} \right|
\end{aligned}
\tag{8.5.16}
\]

Note that $(\ )^{NN}$ means the output of the neural network, while $\varepsilon_x^{NN}$, $\varepsilon_y^{NN}$ and
$\gamma_{xy}^{NN}$ are not the output of the neural network but are quantities calculated from the
displacements $u^{NN}$ and $v^{NN}$ based on Eq. (8.5.4) as

\[
\varepsilon_x^{NN} = \frac{\partial u^{NN}}{\partial x}, \quad
\varepsilon_y^{NN} = \frac{\partial v^{NN}}{\partial y}, \quad
\gamma_{xy}^{NN} = \frac{\partial u^{NN}}{\partial y} + \frac{\partial v^{NN}}{\partial x}
\tag{8.5.17}
\]

The derivative values appearing in Eqs. (8.5.15) and (8.5.17) are those of the
output of the neural network with respect to the input value, which are obtained by the
automatic differentiation implemented in the deep learning library (see Sect. 1.3.8).
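
The sketch below shows, in condensed form, how the derivatives in Eqs. (8.5.15) and
(8.5.17) can be obtained by automatic differentiation; PyTorch is used here purely as an
example of such a library, and the function and variable names are illustrative rather
than those of the reference implementation [4]. In the identification setting, λ and μ
would additionally be registered as trainable parameters.

import torch

def grad(f, x):
    # derivative of the network output f with respect to its input x
    return torch.autograd.grad(f, x, grad_outputs=torch.ones_like(f),
                               create_graph=True)[0]

def pinn_loss(xy, data, nets, lam, mu, fx, fy):
    # xy: (N, 2) sampling points; data: the five teacher fields;
    # nets: the five independent networks for u, v, sigma_x, sigma_y, tau_xy
    xy = xy.clone().requires_grad_(True)
    u, v, sx, sy, txy = [net(xy) for net in nets]
    L1 = sum(torch.mean(torch.abs(out - tgt))                        # Eq. (8.5.14)
             for out, tgt in zip((u, v, sx, sy, txy), data))
    du, dv = grad(u, xy), grad(v, xy)
    eps_x, eps_y = du[:, 0:1], dv[:, 1:2]                            # Eq. (8.5.17)
    gam_xy = du[:, 1:2] + dv[:, 0:1]
    dsx, dsy, dtxy = grad(sx, xy), grad(sy, xy), grad(txy, xy)
    L2 = (torch.mean(torch.abs(dsx[:, 0:1] + dtxy[:, 1:2] + fx))     # Eq. (8.5.15)
          + torch.mean(torch.abs(dtxy[:, 0:1] + dsy[:, 1:2] + fy)))
    L3 = (torch.mean(torch.abs((lam + 2*mu)*eps_x + lam*eps_y - sx))  # Eq. (8.5.16)
          + torch.mean(torch.abs((lam + 2*mu)*eps_y + lam*eps_x - sy))
          + torch.mean(torch.abs(mu*gam_xy - txy)))
    return L1 + L2 + L3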
If highly accurate values of $f_x^*$ and $f_y^*$ in Eq. (8.5.15) are not available at the
sampling points, they are alternatively obtained based on Eq. (8.5.1) as follows:

\[
f_x^* = -\frac{\partial \sigma_x^*}{\partial x} - \frac{\partial \tau_{xy}^*}{\partial y}, \qquad
f_y^* = -\frac{\partial \tau_{xy}^*}{\partial x} - \frac{\partial \sigma_y^*}{\partial y}
\tag{8.5.18}
\]

The derivatives on the right-hand sides of Eq. (8.5.18) can be calculated by using the
central difference approximation (see Sect. 7.2) for $\sigma_x^*$, $\sigma_y^*$ and $\tau_{xy}^*$.

8.5.2 Numerical Example

The identification of the material constants λ and μ using a physics-informed neural


network has been performed for the sample problem in Sect. 8.5.1 [4]. Here, some
of their results are reviewed.
Data Preparation Phase:
Place 100 × 100 sampling points in a grid pattern in the analysis domain and calculate
the coordinates $x^*$ and $y^*$, displacements $u^*$ and $v^*$, and stresses $\sigma_x^*$, $\sigma_y^*$ and $\tau_{xy}^*$, at
each sampling point for λ = 1.0, μ = 0.5 and Q = 4.0.
Training Phase:
A neural network as shown in Fig. 8.28 is constructed. The material constants λ and μ
are treated as parameters, and their values are updated by the error back propagation
learning.
Here, two types of neural networks are tested as follows:
Single network: A single neural network that simultaneously outputs the displacements
$u^{NN}$ and $v^{NN}$, and the stresses $\sigma_x^{NN}$, $\sigma_y^{NN}$, and $\tau_{xy}^{NN}$.
Independent networks: These consist of five neural networks, each of which
outputs one of the five quantities: the two displacements $u^{NN}$ and $v^{NN}$ and the three
stresses $\sigma_x^{NN}$, $\sigma_y^{NN}$, and $\tau_{xy}^{NN}$.
As the latter networks have given better results, they are employed and their
diagrams are shown in Fig. 8.28.

Fig. 8.28 Physics-informed neural network for parameter identification. Reprinted from [4] with
permission from Elsevier

Fig. 8.29 Surrogate modeling using physics-informed neural network. Reprinted from [4] with
permission from Elsevier

In addition, several combinations of the number of hidden layers and the number
of units per hidden layer are tested, showing that the combination of 5 hidden layers
and 50 units per hidden layer results in relatively good results. Note that tanh() is
used for the activation function here.
Application Phase:
As for the identification of the material constants λ and μ, the correct values of
λ = 1.0 and μ = 0.5 are obtained with very few epochs.
A comparison is also made employing a finite element solution instead of an
analytical solution. Dividing the analysis domain into 40 × 40 elements, solutions
using Lagrange elements of the first to fourth order are tested, and good results are
obtained for Lagrange elements of the second and higher orders.
Transfer learning has also been evaluated, and it has been confirmed that a neural
network that has completed training for λ = 1.0 and μ = 0.5 converges (comes
to be able to estimate the correct μ value) faster than learning from scratch when
performing training for patterns with different μ values.
Based on the above results, another physics-informed neural network has been
constructed. This network takes the coordinates $x^*$ and $y^*$, and the material parameter
μ, as input, and outputs the displacements $u^{NN}$ and $v^{NN}$, and the stresses $\sigma_x^{NN}$, $\sigma_y^{NN}$,
and $\tau_{xy}^{NN}$. The error function is the same as that in Eq. (8.5.13). λ = 1.0 is treated as
a constant. Patterns created for four different conditions, μ = 1/4, 2/3, 3/2 and 4,
are used for training.
The estimation accuracy of the trained neural network is evaluated for input of
various μ values including untrained ones. The results are shown in Fig. 8.29. Natu-
rally, the accuracy of estimation is high for μ = 1/4, 2/3, 3/2 and 4, which are
included in the training patterns, but it is seen that acceptable estimation is also
possible for other μ values not used in training, indicating that the physics-informed
neural network can be used as a kind of surrogate model.

References

1. Akin, J.E.: Finite Elements for Analysis and Design. Academic Press (1994).
2. Ferziger, J.H., Peric, M.: Computational Methods for Fluid Dynamics (Second Edition).
Springer (1999).
3. Golub, G.H., Van Loan, C.F.: Matrix Computations (Third Edition). The Johns Hopkins
University Press (1996).
4. Haghighat, E., Raissi, M., Moure, A., Gomez, H., Juanes, R.: A physics-informed deep learning
framework for inversion and surrogate modeling in solid mechanics. Comput. Methods Appl.
Mech. Eng. 379, 113741 (2021). https://doi.org/10.1016/j.cma.2021.113741.
5. Hirai, I., Wang, B.P., Pilkey, W.D.: An efficient zooming method for finite element analysis.
Int. J. Numer. Methods Eng. 20, 1671–1683 (1984). https://doi.org/10.1002/nme.1620200910.
6. Hirsch, C.: Numerical Computation of Internal and External Flows: The Fundamentals of
Computational Fluid Dynamics (Second Edition). Butterworth-Heinemann (2007).
7. Hughes, T.J.R.: The Finite Element Method: Linear Static and Dynamic Finite Element
Analysis. Dover (2000).
8. Jennings, A., McKeown, J.J.: Matrix Computations (Second Edition). John Wiley & Sons
(1992).
9. Jung, J., Yoon, K., Lee, P.-S.: Deep learned finite elements. Comput. Methods Appl. Mech.
Eng. 372, 113401 (2020). https://doi.org/10.1016/j.cma.2020.113401.
10. Kingma, D.P., Rezende, D.J., Mohamed, S., Welling, M.: Semi-supervised learning with deep
generative models. Adv. Neural Inf. Process. Sys. 27, 3581–3589 (2014).
11. Mao, K.M., Sun, C.T.: A refined global-local finite element analysis method. Int. J. Numer.
Methods Eng. 32, 29–43 (1991). https://doi.org/10.1002/nme.1620320103.
12. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep
learning framework for solving forward and inverse problems involving nonlinear partial
differential equations. J. Comput. Phys. 378, 686–707 (2019).
13. Ranade, R., Hill, C., Pathak, J.: DiscretizationNet: A machine-learning based solver for Navier–
Stokes equations using finite volume discretization. Comput. Methods Appl. Mech. Eng. 378,
113722 (2021). https://doi.org/10.1016/j.cma.2021.113722.
14. Simo, J.C., Hughes, T.J.R.: Computational Inelasticity. Springer (1998).
15. Sohn, K., Lee, H., Yan, X.: Learning Structured Output Representation using Deep Conditional
Generative Models. Adv. Neural Inf. Process. Sys. 28, 3483–3491 (2015).
16. Yamaguchi, T., Okuda, H.: Prediction of stress concentration at fillets using a neural network
for efficient finite element analysis. Mech. Eng. Lett. 6, 20–00318 (2020). https://doi.org/10.1299/mel.20-00318.
17. Yamaguchi, T., Okuda, H.: Zooming method for FEA using a neural network. Comput. Struct.
247, 106480 (2021). https://doi.org/10.1016/j.compstruc.2021.106480.
18. Yao, H., Gao, Y., Liu, Y.: FEA-Net: A physics-guided data-driven model for efficient mechan-
ical response prediction. Comput. Methods Appl. Mech. Eng. 363, 112892 (2020). https://doi.org/10.1016/j.cma.2020.112892.
19. Zienkiewicz, O. C., Taylor, R. L.: The Finite Element Method (5th Ed.) Volume 1: The Basis.
Butterworth-Heinemann (2000).
Part III
Computational Procedures
Chapter 9
Bases for Computer Programming

Abstract It is essential to run various programs on computers to realize “Com-


putational Mechanics with Deep Learning”. This chapter provides an overview of
“Computational Mechanics with Deep Learning” from the perspective of program-
ming. Section 9.1 describes some programs in the field of computational mechanics
used in the Data Preparation Phase, including three topics discussed in the case study:
the element stiffness matrix by using numerical quadrature in the finite element anal-
ysis (Sect. 9.1.1), parameters representing the shape of an element (Sect. 9.1.2), and
NURBS basis functions (Sect. 9.1.3). Section 9.2 discusses some programs in C and
Python for deep learning (neural networks) used in the Training Phase, where the
mathematical formulas are described in detail so that they can be easily compared
with practical programs.

9.1 Computer Programming for Data Preparation Phase

To perform “Computational Mechanics with Deep Learning”, deep understanding


of computational mechanics as well as related computer codes is needed. In this
section, the following programs are discussed: a program for the element integration
(Sect. 9.1.1), that for calculating parameters representing the shape of an element
(Sect. 9.1.2), both have been discussed in Chap. 4, and that for calculating NURBS
basis functions (Sect. 9.1.3) used in Chap. 6.
The programming language used here is C [7, 13], which was developed in the
1970s and is still used widely for numerical computation [14].

9.1.1 Element Stiffness Matrix

The finite element method, a major numerical analysis method, is composed of the
three processes as follows:
Preprocessing process to divide the analysis target into elements,


Solver process to perform the finite element analysis, and


Postprocessing process to visualize the analysis results.
The solver consists of the following two processes: an element integration process
to construct the global stiffness matrix and that for solving simultaneous linear equa-
tions. The latter is commonly used in various numerical methods, but the former
is unique to the finite element method. In Chap. 4, convergence properties of the
elemental integration are studied, while the program implementation of the elemental
integration is detailed in this section.
In this section, a program to calculate the element stiffness matrix of an 8-node
isoparametric hexahedral element of the first order is shown.
As studied in Chap. 4, the element stiffness matrix $[k^e]$ is achieved as the integral
over the element $v^e$ as follows:

\[
[k^e] = \int_{v^e} [B]^T [D] [B] \, dv
\tag{9.1.1}
\]

where [D] is the stress–strain matrix and [B] the strain–displacement matrix. A
homogeneous and isotropic elastic material is assumed here.
This integral is usually calculated by the Gauss–Legendre quadrature as follows
(see Chap. 4 for detail):

\[
[k^e] \approx \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{l}
\left( [B]^T [D] [B] \cdot |J| \right)\Big|_{\xi = \xi_i,\ \eta = \eta_j,\ \zeta = \zeta_k} \cdot H_{i,j,k}
\tag{9.1.2}
\]

where n, m, and l are the numbers of integration points in each axis direction,
$(\xi_i, \eta_j, \zeta_k)$ and $H_{i,j,k}$ are, respectively, the coordinates and the weights at the
integration points. In the programs in this section, n = m = l is assumed for
simplicity.
In the following, the entire source code of the function esm3D08() that calcu-
lates the element stiffness matrix of an 8-node isoparametric hexahedral element
of the first order is shown, while Table 9.1 summarizes main variables and arrays
employed.

Table 9.1 Variables and arrays used in esm3D08()

elem[i]       int, 1D array      Global node number of the i-th node of the element
node[i][j]    double, 2D array   The j-th coordinate value of the i-th node. This array includes
                                 all the nodes in the analysis domain
mate[i]       double, 1D array   Material properties of the element. mate[0]: Young's modulus,
                                 mate[1]: Poisson's ratio for linear isotropic elasticity
esm[i][j]     double, 2D array   Element stiffness matrix (Output)
ngauss        int                Number of quadrature points per axis
gc[i]         double, 1D array   Coordinate values of the i-th quadrature point
gw[i]         double, 1D array   Weight of the i-th quadrature point
nfpn          int                Number of freedom per node
e             double             Young's modulus
v             double             Poisson's ratio
coord[8][3]   double, 2D array   Coordinate values of all the nodes belonging to the element
J[3][3]       double, 2D array   Jacobian matrix
invJ[3][3]    double, 2D array   Inverse of Jacobian matrix
D[6][6]       double, 2D array   Stress–strain matrix for 3D linear isotropic elasticity
B[6][24]      double, 2D array   B matrix
N[8][7]       double, 2D array   Basis functions: N[i][6] is the i-th basis function;
                                 N[i][0], N[i][1], N[i][2] are its first derivatives with
                                 respect to ξ, η, ζ; N[i][3], N[i][4], N[i][5] are its first
                                 derivatives with respect to x, y, z
s             double             Local coordinate value ξ
t             double             Local coordinate value η
u             double             Local coordinate value ζ

/* esm3D08.c */
void esm3D08(
    int *elem,
    double **node,
    double *mate,
    double **esm,
    int ngauss,
    double *gc,
    double *gw,
    int nfpn)
{
  int i,j,k,ii,jj,kk,counter,necm=6,nnpe=8,kdim=24;
  double e,v,ee,det,coord[8][3],J[3][3],invJ[3][3], [6],
         D[6][6],s,t,u,ra,rs,sa,ss,ua,us,N[8][7],B[6][24],
         DB[6][24];
  double dtmp,ra2,rs2,ssus,ssua,saus,saua;

The above is the header part of the function. This function, taking element data
elem[], nodal data node[][], and material data mate[] as input data, calculates
Eq. (9.1.2) by using the Gauss–Legendre quadrature with ngauss integration points
per axis, and outputs the element stiffness matrix as esm[][].
The node data node[][] represents a two-dimensional array that stores the coor-
dinates of all the nodes in the entire analysis domain. For example, node[20][0]
contains the x-coordinate of the node with the global node number 20, and
node[35][2] the z-coordinate of the node with the global node number 35. The
element data elem[] contains the global numbers of the eight nodes that define an
element. Note that two kinds of node numbers are usually used in the finite element
method: the global node number, which is defined as the sequential number attached
to all the nodes in a whole analysis domain, and the element node number, which is
defined only in an element.
Let’s look at Fig. 9.1, where the left hand side of the figure is an element in
real space, and the number of each node is that of the global one. In the Gauss–
Legendre quadrature, the element is mapped to the local coordinate space as shown
in the right-hand side of the figure, where the integration process is performed. The
number attached to each node from 0 to 7 in the local coordinate space is that of the
element one, and the present program is based on this numbering system.
Note that the element data elem[] in esm3D08() must satisfy the rule that
nodes in an element are arranged in accordance with the ordering of the element node
number. For example, we, respectively, show allowable and not allowable cases for
the element shown in Fig. 9.1 as follows:

Fig. 9.1 Construction of elem[ ] array (an element in real space with global node numbers,
mapped to the local coordinate cube with element node numbers 0–7, corners (−1,−1,−1)
and (1,1,1))



Allowable:

{7, 17, 51, 72, 107, 201, 305, 407}

{7, 51, 17, 107, 72, 201, 407, 305}

{72, 305, 407, 201, 7, 107, 17, 51}

{72, 201, 51, 7, 305, 407, 17, 107}

Not allowable:

{7, 17, 51, 72, 107, 201, 305, 407}

{7, 17, 51, 107, 72, 407, 201, 305}

{7, 107, 17, 51, 72, 305, 407, 201}

for(ii=0;ii<kdim;ii++){
for(jj=0;jj<kdim;jj++) esm[ii][jj] = 0.0;
}
for(i=0;i<nnpe;i++){
ii = elem[i];
for(j=0;j<nfpn;j++) coord[i][j] = node[ii][j];
}
for(i=0;i<necm;i++)
for(j=0;j<kdim;j++) B[i][j] = 0.0;

In the code above, using node[][] and elem[], only the coordinates of the
8 nodes belonging to the element are stored in coord[][], while esm[][] and
B[][] are cleared to zero.

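/* Stress-strain matrix [D] of Eq. (9.1.3), built from Young's modulus e and Poisson's ratio v */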
e = mate[0];
v = mate[1];
for(i=0;i<necm;i++)
for(j=0;j<necm;j++) D[i][j] = 0.0;
ee = e*(1.0 - v)/(1.0 + v)/(1.0 - 2.0*v);
D[0][0] = ee;
D[1][1] = ee;
D[2][2] = ee;
D[3][3] = ee*(1.0 - 2.0*v)/2.0/(1.0 - v);
D[4][4] = D[3][3];
D[5][5] = D[3][3];
D[0][1] = ee*v/(1.0 - v);
D[0][2] = D[0][1];
D[1][2] = D[0][1];
D[1][0] = D[0][1];
D[2][0] = D[0][2];
D[2][1] = D[1][2];

In addition, the stress–strain matrix $[D]$ (D[][]) of a three-dimensional isotropic
elastic material is given with the Young's modulus E and the Poisson's ratio ν as

\[
[D] = \frac{E(1-\nu)}{(1+\nu)(1-2\nu)}
\begin{bmatrix}
1 & \frac{\nu}{1-\nu} & \frac{\nu}{1-\nu} & 0 & 0 & 0 \\
\frac{\nu}{1-\nu} & 1 & \frac{\nu}{1-\nu} & 0 & 0 & 0 \\
\frac{\nu}{1-\nu} & \frac{\nu}{1-\nu} & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{1-2\nu}{2(1-\nu)} & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{1-2\nu}{2(1-\nu)} & 0 \\
0 & 0 & 0 & 0 & 0 & \frac{1-2\nu}{2(1-\nu)}
\end{bmatrix}
\tag{9.1.3}
\]

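/* Triple-nested loop over the ngauss x ngauss x ngauss Gauss-Legendre integration points;
   (s,t,u) are the local coordinates (xi, eta, zeta) of the current point */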
for(i=0;i<ngauss;i++){
s = gc[i];
for(j=0;j<ngauss;j++){
t = gc[j];
for(k=0;k<ngauss;k++){
u = gc[k];

This is the starting part of the triple-nested loop that calculates the contribution
of each integration point in turn in the Gauss–Legendre quadrature. The variables s,
t, and u correspond to the local coordinates ξ, η, and ζ, respectively. As for gc[]
and gw[], for example, when ngauss is 2, gc[0] and gc[1] are $-1/\sqrt{3}$ and
$1/\sqrt{3}$, respectively, and gw[0] and gw[1] are both 1.0. (See Chap. 4 for detail.)

ra = (1.0 + s)*0.5;
rs = (1.0 - s)*0.5;
sa = (1.0 + t)*0.5;
ss = (1.0 - t)*0.5;
ua = (1.0 + u)*0.5;
us = (1.0 - u)*0.5;
ssus = ss*us;
saus = sa*us;
ssua = ss*ua;
saua = sa*ua;
N[0][6] = rs*ssus;
N[1][6] = ra*ssus;
N[2][6] = ra*saus;
N[3][6] = rs*saus;
N[4][6] = rs*ssua;
N[5][6] = ra*ssua;
N[6][6] = ra*saua;
N[7][6] = rs*saua;
N[0][0] = -0.5*ssus;
N[1][0] = 0.5*ssus;
N[2][0] = 0.5*saus;
N[3][0] = -0.5*saus;
N[4][0] = -0.5*ssua;
N[5][0] = 0.5*ssua;
N[6][0] = 0.5*saua;
N[7][0] = -0.5*saua;
rs2 = 0.5*rs;
ra2 = 0.5*ra;
N[0][1] = -rs2*us;
N[1][1] = -ra2*us;
N[2][1] = ra2*us;
N[3][1] = rs2*us;
N[4][1] = -rs2*ua;
N[5][1] = -ra2*ua;
N[6][1] = ra2*ua;
N[7][1] = rs2*ua;
N[0][2] = -rs2*ss;
N[1][2] = -ra2*ss;
N[2][2] = -ra2*sa;
N[3][2] = -rs2*sa;
N[4][2] = rs2*ss;
N[5][2] = ra2*ss;
N[6][2] = ra2*sa;
N[7][2] = rs2*sa;

In the code above, the values of the basis functions at the integration point are
calculated, where N[i][6] denotes the value of the basis function Ni (ξ, η, ζ ),
which is defined as follows:
\[
\mathtt{N[0][6]} = N_0(\xi,\eta,\zeta) = \frac{1}{8}(1-\xi)(1-\eta)(1-\zeta) \tag{9.1.4}
\]
\[
\mathtt{N[1][6]} = N_1(\xi,\eta,\zeta) = \frac{1}{8}(1+\xi)(1-\eta)(1-\zeta) \tag{9.1.5}
\]
\[
\mathtt{N[2][6]} = N_2(\xi,\eta,\zeta) = \frac{1}{8}(1+\xi)(1+\eta)(1-\zeta) \tag{9.1.6}
\]
\[
\mathtt{N[3][6]} = N_3(\xi,\eta,\zeta) = \frac{1}{8}(1-\xi)(1+\eta)(1-\zeta) \tag{9.1.7}
\]
\[
\mathtt{N[4][6]} = N_4(\xi,\eta,\zeta) = \frac{1}{8}(1-\xi)(1-\eta)(1+\zeta) \tag{9.1.8}
\]
\[
\mathtt{N[5][6]} = N_5(\xi,\eta,\zeta) = \frac{1}{8}(1+\xi)(1-\eta)(1+\zeta) \tag{9.1.9}
\]
\[
\mathtt{N[6][6]} = N_6(\xi,\eta,\zeta) = \frac{1}{8}(1+\xi)(1+\eta)(1+\zeta) \tag{9.1.10}
\]
\[
\mathtt{N[7][6]} = N_7(\xi,\eta,\zeta) = \frac{1}{8}(1-\xi)(1+\eta)(1+\zeta) \tag{9.1.11}
\]
N[i][0] in the code denotes the value of the partial derivative of the basis
function Ni (ξ, η, ζ ) with respect to ξ , which is defined as

\[
\mathtt{N[0][0]} = \frac{\partial N_0(\xi,\eta,\zeta)}{\partial \xi} = -\frac{1}{8}(1-\eta)(1-\zeta) \tag{9.1.12}
\]
\[
\mathtt{N[1][0]} = \frac{\partial N_1(\xi,\eta,\zeta)}{\partial \xi} = \frac{1}{8}(1-\eta)(1-\zeta) \tag{9.1.13}
\]
\[
\mathtt{N[2][0]} = \frac{\partial N_2(\xi,\eta,\zeta)}{\partial \xi} = \frac{1}{8}(1+\eta)(1-\zeta) \tag{9.1.14}
\]
\[
\mathtt{N[3][0]} = \frac{\partial N_3(\xi,\eta,\zeta)}{\partial \xi} = -\frac{1}{8}(1+\eta)(1-\zeta) \tag{9.1.15}
\]
\[
\mathtt{N[4][0]} = \frac{\partial N_4(\xi,\eta,\zeta)}{\partial \xi} = -\frac{1}{8}(1-\eta)(1+\zeta) \tag{9.1.16}
\]
\[
\mathtt{N[5][0]} = \frac{\partial N_5(\xi,\eta,\zeta)}{\partial \xi} = \frac{1}{8}(1-\eta)(1+\zeta) \tag{9.1.17}
\]
\[
\mathtt{N[6][0]} = \frac{\partial N_6(\xi,\eta,\zeta)}{\partial \xi} = \frac{1}{8}(1+\eta)(1+\zeta) \tag{9.1.18}
\]
\[
\mathtt{N[7][0]} = \frac{\partial N_7(\xi,\eta,\zeta)}{\partial \xi} = -\frac{1}{8}(1+\eta)(1+\zeta) \tag{9.1.19}
\]
N[i][1] in the code denotes the value of the partial derivative of the basis function
$N_i(\xi, \eta, \zeta)$ with respect to η, which is defined as

\[
\mathtt{N[0][1]} = \frac{\partial N_0(\xi,\eta,\zeta)}{\partial \eta} = -\frac{1}{8}(1-\xi)(1-\zeta) \tag{9.1.20}
\]
\[
\mathtt{N[1][1]} = \frac{\partial N_1(\xi,\eta,\zeta)}{\partial \eta} = -\frac{1}{8}(1+\xi)(1-\zeta) \tag{9.1.21}
\]
\[
\mathtt{N[2][1]} = \frac{\partial N_2(\xi,\eta,\zeta)}{\partial \eta} = \frac{1}{8}(1+\xi)(1-\zeta) \tag{9.1.22}
\]
\[
\mathtt{N[3][1]} = \frac{\partial N_3(\xi,\eta,\zeta)}{\partial \eta} = \frac{1}{8}(1-\xi)(1-\zeta) \tag{9.1.23}
\]
\[
\mathtt{N[4][1]} = \frac{\partial N_4(\xi,\eta,\zeta)}{\partial \eta} = -\frac{1}{8}(1-\xi)(1+\zeta) \tag{9.1.24}
\]
\[
\mathtt{N[5][1]} = \frac{\partial N_5(\xi,\eta,\zeta)}{\partial \eta} = -\frac{1}{8}(1+\xi)(1+\zeta) \tag{9.1.25}
\]
\[
\mathtt{N[6][1]} = \frac{\partial N_6(\xi,\eta,\zeta)}{\partial \eta} = \frac{1}{8}(1+\xi)(1+\zeta) \tag{9.1.26}
\]
\[
\mathtt{N[7][1]} = \frac{\partial N_7(\xi,\eta,\zeta)}{\partial \eta} = \frac{1}{8}(1-\xi)(1+\zeta) \tag{9.1.27}
\]

And N[i][2] in the code denotes the value of the partial derivative of the basis
function Ni (ξ, η, ζ ) with respect to ζ , which is defined as

\[
\mathtt{N[0][2]} = \frac{\partial N_0(\xi,\eta,\zeta)}{\partial \zeta} = -\frac{1}{8}(1-\xi)(1-\eta) \tag{9.1.28}
\]
\[
\mathtt{N[1][2]} = \frac{\partial N_1(\xi,\eta,\zeta)}{\partial \zeta} = -\frac{1}{8}(1+\xi)(1-\eta) \tag{9.1.29}
\]
\[
\mathtt{N[2][2]} = \frac{\partial N_2(\xi,\eta,\zeta)}{\partial \zeta} = -\frac{1}{8}(1+\xi)(1+\eta) \tag{9.1.30}
\]
\[
\mathtt{N[3][2]} = \frac{\partial N_3(\xi,\eta,\zeta)}{\partial \zeta} = -\frac{1}{8}(1-\xi)(1+\eta) \tag{9.1.31}
\]
\[
\mathtt{N[4][2]} = \frac{\partial N_4(\xi,\eta,\zeta)}{\partial \zeta} = \frac{1}{8}(1-\xi)(1-\eta) \tag{9.1.32}
\]
\[
\mathtt{N[5][2]} = \frac{\partial N_5(\xi,\eta,\zeta)}{\partial \zeta} = \frac{1}{8}(1+\xi)(1-\eta) \tag{9.1.33}
\]
\[
\mathtt{N[6][2]} = \frac{\partial N_6(\xi,\eta,\zeta)}{\partial \zeta} = \frac{1}{8}(1+\xi)(1+\eta) \tag{9.1.34}
\]
\[
\mathtt{N[7][2]} = \frac{\partial N_7(\xi,\eta,\zeta)}{\partial \zeta} = \frac{1}{8}(1-\xi)(1+\eta) \tag{9.1.35}
\]

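/* Jacobian matrix J[][] of Eq. (9.1.37): sums of basis-function derivatives times nodal coordinates */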
for(ii=0;ii<nfpn;ii++){
for(jj=0;jj<nfpn;jj++){
J[ii][jj] = 0.0;
for(kk=0;kk<nnpe;kk++){
J[ii][jj] += N[kk][ii]*coord[kk][jj];
}
}
}

In the code above, based on the basic equation for isoparametric elements as

\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} x(\xi,\eta,\zeta) \\ y(\xi,\eta,\zeta) \\ z(\xi,\eta,\zeta) \end{pmatrix}
=
\sum_{i=1}^{n} N_i(\xi,\eta,\zeta) \cdot
\begin{pmatrix} X_i \\ Y_i \\ Z_i \end{pmatrix}
\tag{9.1.36}
\]

the Jacobian matrix [J ] of the coordinate transformation is calculated by summing


the products of the derivatives of the basis functions and the nodal coordinates as
[J] = \begin{bmatrix} \dfrac{\partial x}{\partial \xi} & \dfrac{\partial y}{\partial \xi} & \dfrac{\partial z}{\partial \xi} \\ \dfrac{\partial x}{\partial \eta} & \dfrac{\partial y}{\partial \eta} & \dfrac{\partial z}{\partial \eta} \\ \dfrac{\partial x}{\partial \zeta} & \dfrac{\partial y}{\partial \zeta} & \dfrac{\partial z}{\partial \zeta} \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} \dfrac{\partial N_i}{\partial \xi} X_i & \sum_{i=1}^{n} \dfrac{\partial N_i}{\partial \xi} Y_i & \sum_{i=1}^{n} \dfrac{\partial N_i}{\partial \xi} Z_i \\ \sum_{i=1}^{n} \dfrac{\partial N_i}{\partial \eta} X_i & \sum_{i=1}^{n} \dfrac{\partial N_i}{\partial \eta} Y_i & \sum_{i=1}^{n} \dfrac{\partial N_i}{\partial \eta} Z_i \\ \sum_{i=1}^{n} \dfrac{\partial N_i}{\partial \zeta} X_i & \sum_{i=1}^{n} \dfrac{\partial N_i}{\partial \zeta} Y_i & \sum_{i=1}^{n} \dfrac{\partial N_i}{\partial \zeta} Z_i \end{bmatrix}   (9.1.37)

det = J[0][0]*J[1][1]*J[2][2]
+ J[0][1]*J[1][2]*J[2][0]
+ J[0][2]*J[1][0]*J[2][1]
- J[0][0]*J[1][2]*J[2][1]
- J[0][1]*J[1][0]*J[2][2]
- J[0][2]*J[1][1]*J[2][0] ;
invJ[0][0] = (J[1][1]*J[2][2] - J[1][2]*J[2][1])/det;
invJ[0][1] = (J[0][2]*J[2][1] - J[0][1]*J[2][2])/det;
invJ[0][2] = (J[0][1]*J[1][2] - J[1][1]*J[0][2])/det;
invJ[1][0] = (J[1][2]*J[2][0] - J[1][0]*J[2][2])/det;
invJ[1][1] = (J[0][0]*J[2][2] - J[0][2]*J[2][0])/det;
invJ[1][2] = (J[1][0]*J[0][2] - J[0][0]*J[1][2])/det;
invJ[2][0] = (J[1][0]*J[2][1] - J[1][1]*J[2][0])/det;
invJ[2][1] = (J[0][1]*J[2][0] - J[0][0]*J[2][1])/det;
invJ[2][2] = (J[0][0]*J[1][1] - J[0][1]*J[1][0])/det;

The determinant and the inverse of the Jacobian matrix [J] are calculated in the
code above. The determinant of a two-by-two matrix is evaluated directly, while
that of a three-by-three or larger matrix can be obtained recursively by cofactor
(Laplace) expansion; for a three-by-three matrix the expansion reads
\begin{vmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \end{vmatrix} = a_{00} \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} - a_{01} \begin{vmatrix} a_{10} & a_{12} \\ a_{20} & a_{22} \end{vmatrix} + a_{02} \begin{vmatrix} a_{10} & a_{11} \\ a_{20} & a_{21} \end{vmatrix}
= a_{00}(a_{11} a_{22} - a_{12} a_{21}) - a_{01}(a_{10} a_{22} - a_{12} a_{20}) + a_{02}(a_{10} a_{21} - a_{11} a_{20})   (9.1.38)

The inverse matrix can be obtained using the formulas of linear algebra. Let the
n-th order matrix [A] be given as
[A] = \begin{bmatrix} a_{00} & \cdots & a_{0,n-1} \\ \vdots & \ddots & \vdots \\ a_{n-1,0} & \cdots & a_{n-1,n-1} \end{bmatrix}   (9.1.39)

Then, the inverse of the matrix above is given by

[A]^{-1} = \frac{1}{|[A]|} \left[ \tilde{A} \right]   (9.1.40)

where |[A]| is the determinant of the matrix [A], and \left[ \tilde{A} \right] is its adjugate matrix given by

\left[ \tilde{A} \right] = \begin{bmatrix} \tilde{a}_{00} & \cdots & \tilde{a}_{n-1,0} \\ \vdots & \ddots & \vdots \\ \tilde{a}_{0,n-1} & \cdots & \tilde{a}_{n-1,n-1} \end{bmatrix}   (9.1.41)

where ãi j is defined by


\tilde{a}_{ij} = (-1)^{i+j} \cdot \left| \left[ {}^{ij}A \right] \right|   (9.1.42)

with
\left[ {}^{ij}A \right] = \begin{bmatrix} a_{00} & \cdots & a_{0,j-1} & a_{0,j+1} & \cdots & a_{0,n-1} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ a_{i-1,0} & \cdots & a_{i-1,j-1} & a_{i-1,j+1} & \cdots & a_{i-1,n-1} \\ a_{i+1,0} & \cdots & a_{i+1,j-1} & a_{i+1,j+1} & \cdots & a_{i+1,n-1} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ a_{n-1,0} & \cdots & a_{n-1,j-1} & a_{n-1,j+1} & \cdots & a_{n-1,n-1} \end{bmatrix}   (9.1.43)

Thus, for a matrix [A] shown as


[A] = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \end{bmatrix}   (9.1.44)

its adjugate matrix \left[ \tilde{A} \right] is given by

\left[ \tilde{A} \right] = \begin{bmatrix} \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} & -\begin{vmatrix} a_{01} & a_{02} \\ a_{21} & a_{22} \end{vmatrix} & \begin{vmatrix} a_{01} & a_{02} \\ a_{11} & a_{12} \end{vmatrix} \\ -\begin{vmatrix} a_{10} & a_{12} \\ a_{20} & a_{22} \end{vmatrix} & \begin{vmatrix} a_{00} & a_{02} \\ a_{20} & a_{22} \end{vmatrix} & -\begin{vmatrix} a_{00} & a_{02} \\ a_{10} & a_{12} \end{vmatrix} \\ \begin{vmatrix} a_{10} & a_{11} \\ a_{20} & a_{21} \end{vmatrix} & -\begin{vmatrix} a_{00} & a_{01} \\ a_{20} & a_{21} \end{vmatrix} & \begin{vmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{vmatrix} \end{bmatrix}   (9.1.45)

for(ii=0;ii<nnpe;ii++){
N[ii][3] = 0.0;
N[ii][4] = 0.0;
N[ii][5] = 0.0;
for(jj=0;jj<3;jj++){
N[ii][3] += invJ[0][jj]*N[ii][jj];
N[ii][4] += invJ[1][jj]*N[ii][jj];
N[ii][5] += invJ[2][jj]*N[ii][jj];
}
}

In the code above, the first-order derivatives of the basis functions with respect
to x, y, and z are calculated. By the chain rule of differentiation, the relationship
between the first-order derivatives of the basis functions with respect to x, y, and z
and those with respect to ξ, η, and ζ is written as follows:
\begin{pmatrix} \dfrac{\partial N_i}{\partial \xi} \\ \dfrac{\partial N_i}{\partial \eta} \\ \dfrac{\partial N_i}{\partial \zeta} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial N_i}{\partial x}\dfrac{\partial x}{\partial \xi} + \dfrac{\partial N_i}{\partial y}\dfrac{\partial y}{\partial \xi} + \dfrac{\partial N_i}{\partial z}\dfrac{\partial z}{\partial \xi} \\ \dfrac{\partial N_i}{\partial x}\dfrac{\partial x}{\partial \eta} + \dfrac{\partial N_i}{\partial y}\dfrac{\partial y}{\partial \eta} + \dfrac{\partial N_i}{\partial z}\dfrac{\partial z}{\partial \eta} \\ \dfrac{\partial N_i}{\partial x}\dfrac{\partial x}{\partial \zeta} + \dfrac{\partial N_i}{\partial y}\dfrac{\partial y}{\partial \zeta} + \dfrac{\partial N_i}{\partial z}\dfrac{\partial z}{\partial \zeta} \end{pmatrix} = \begin{bmatrix} \dfrac{\partial x}{\partial \xi} & \dfrac{\partial y}{\partial \xi} & \dfrac{\partial z}{\partial \xi} \\ \dfrac{\partial x}{\partial \eta} & \dfrac{\partial y}{\partial \eta} & \dfrac{\partial z}{\partial \eta} \\ \dfrac{\partial x}{\partial \zeta} & \dfrac{\partial y}{\partial \zeta} & \dfrac{\partial z}{\partial \zeta} \end{bmatrix} \begin{pmatrix} \dfrac{\partial N_i}{\partial x} \\ \dfrac{\partial N_i}{\partial y} \\ \dfrac{\partial N_i}{\partial z} \end{pmatrix}   (9.1.46)

Therefore, by using the inverse of the Jacobian matrix, the first-order derivatives
of the basis functions with respect to x, y, and z are calculated as
\begin{pmatrix} \partial N_i / \partial x \\ \partial N_i / \partial y \\ \partial N_i / \partial z \end{pmatrix} = [J]^{-1} \begin{pmatrix} \partial N_i / \partial \xi \\ \partial N_i / \partial \eta \\ \partial N_i / \partial \zeta \end{pmatrix}   (9.1.47)

This relation is implemented in the code above.

for(ii=0;ii<nnpe;ii++){
jj = ii*nfpn;
B[0][jj] = N[ii][3];
B[1][1+jj] = N[ii][4];
B[2][2+jj] = N[ii][5];
B[3][jj] = N[ii][4];
B[3][1+jj] = N[ii][3];
B[4][1+jj] = N[ii][5];
B[4][2+jj] = N[ii][4];
B[5][jj] = N[ii][5];
B[5][2+jj] = N[ii][3];
}

In this code, the strain–displacement matrix [B] is calculated, which is defined as


the product of the matrix of basis functions [N ] and the partial differential operator
matrix [L] (see Chap. 4), respectively, given as
[N] = \begin{bmatrix} N_0 & 0 & 0 & & N_7 & 0 & 0 \\ 0 & N_0 & 0 & \cdots & 0 & N_7 & 0 \\ 0 & 0 & N_0 & & 0 & 0 & N_7 \end{bmatrix}   (9.1.48)

and
[L] = \begin{bmatrix} \partial/\partial x & 0 & 0 \\ 0 & \partial/\partial y & 0 \\ 0 & 0 & \partial/\partial z \\ \partial/\partial y & \partial/\partial x & 0 \\ 0 & \partial/\partial z & \partial/\partial y \\ \partial/\partial z & 0 & \partial/\partial x \end{bmatrix}   (9.1.49)

Then, the strain–displacement matrix [B] is written as follows:


[B] = [L][N] = \begin{bmatrix} \dfrac{\partial N_0}{\partial x} & 0 & 0 & & \dfrac{\partial N_7}{\partial x} & 0 & 0 \\ 0 & \dfrac{\partial N_0}{\partial y} & 0 & & 0 & \dfrac{\partial N_7}{\partial y} & 0 \\ 0 & 0 & \dfrac{\partial N_0}{\partial z} & & 0 & 0 & \dfrac{\partial N_7}{\partial z} \\ \dfrac{\partial N_0}{\partial y} & \dfrac{\partial N_0}{\partial x} & 0 & \cdots & \dfrac{\partial N_7}{\partial y} & \dfrac{\partial N_7}{\partial x} & 0 \\ 0 & \dfrac{\partial N_0}{\partial z} & \dfrac{\partial N_0}{\partial y} & & 0 & \dfrac{\partial N_7}{\partial z} & \dfrac{\partial N_7}{\partial y} \\ \dfrac{\partial N_0}{\partial z} & 0 & \dfrac{\partial N_0}{\partial x} & & \dfrac{\partial N_7}{\partial z} & 0 & \dfrac{\partial N_7}{\partial x} \end{bmatrix}   (9.1.50)

As is shown in Eq. (9.1.50), the strain–displacement matrix [B] is a matrix


consisting of the first-order derivatives of the basis functions with respect to x, y,
and z.

for(ii=0;ii<necm;ii++){
for(jj=0;jj<kdim;jj++){
DB[ii][jj] = 0.0;
for(kk=0;kk<necm;kk++){
DB[ii][jj] += D[ii][kk]*B[kk][jj];
}
}
}
dtmp = gw[i]*gw[j]*gw[k]*det;
for(ii=0;ii<kdim;ii++){
for(jj=0;jj<kdim;jj++){
for(kk=0;kk<necm;kk++){
esm[ii][jj]
+= B[kk][ii]*DB[kk][jj]*dtmp;
}
}
}
}
}
}
}
/* End of esm3D08.c */

In the code above, using [D], [B], and |[J ]| obtained so far, the element stiffness
matrix is calculated based on the following equation.

[k^e] \approx \sum_{i=0}^{ng-1} \sum_{j=0}^{ng-1} \sum_{k=0}^{ng-1} \left( [B]^T [D] [B] \cdot |[J]| \right)\Big|_{\xi=\xi_i,\ \eta=\eta_j,\ \zeta=\zeta_k} \cdot H_{i,j,k}   (9.1.51)

where ng is the number of integration points per axis (ngauss) and H_{i,j,k} the
weight at the integration point (\xi_i, \eta_j, \zeta_k), given by the product of the weights in
each axis direction as

H_{i,j,k} = H_i \cdot H_j \cdot H_k   (9.1.52)

Note that the variable dtmp in the code is the product of H_{i,j,k} and |[J]|.
In the innermost part of the triple-nested loop over the integration points, the contribution
of each integration point to the element stiffness matrix is calculated, and all the
contributions are finally summed to obtain the element stiffness matrix.
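As a concrete illustration (not from the book's listing; the arrays gc[] and gw[] are those passed to esm3D08(), but their actual contents are filled elsewhere), the standard two-point Gauss–Legendre rule per axis would be set up as follows:

/* Standard 2-point Gauss-Legendre abscissas and weights per axis (ngauss = 2) */
int    ngauss = 2;
double gc[2] = { -0.5773502691896258, 0.5773502691896258 };  /* -1/sqrt(3), +1/sqrt(3) */
double gw[2] = { 1.0, 1.0 };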
The element stiffness matrix of each element is calculated by esm3D08()
above, and the global stiffness matrix is constructed by adding the element stiff-
ness matrices of all the elements. The element stiffness matrix is arranged assuming
the displacement vector as follows:
\left( U_0\ V_0\ W_0\ U_1\ V_1\ W_1\ \cdots\ U_7\ V_7\ W_7 \right)^T   (9.1.53)

where \left( U_i\ V_i\ W_i \right)^T is the displacement vector for the element node number i.
On the other hand, the global stiffness matrix is arranged according to the displace-
ment vectors of all the nodes in the entire domain. Let the total number of nodes be
N, and the global displacement vector is expressed as follows:
\left( U_0\ V_0\ W_0\ U_1\ V_1\ W_1\ \cdots\ U_{N-1}\ V_{N-1}\ W_{N-1} \right)^T   (9.1.54)

where \left( U_j\ V_j\ W_j \right)^T is the displacement vector for the global node number j.
Thus, the construction of the global stiffness matrix is performed by adding each
element stiffness matrix to the appropriate position in the global stiffness matrix
according to the correspondence between the global node number and the element
one.
A program code for constructing a global stiffness matrix based on the element
stiffness matrices is shown as follows:

for(i=0;i<N*nfpn;i++)
for(j=0;j<N*nfpn;j++) gsm[i][j] = 0.0 ;
for(iel=0;iel<nelem;iel++){
esm3D08(elem[iel],node,mate[iel],esm,ngauss,gc,gw,nfpn);
for(i=0;i<nnpe;i++){
idof = elem[iel][i]*nfpn;
for(j=0;j<nnpe;j++){
jdof = elem[iel][j]*nfpn;

for(ia=0;ia<nfpn;ia++){
for(ja=0;ja<nfpn;ja++){
gsm[idof+ia][jdof+ja]
+= esm[i*nfpn+ia][j*nfpn+ja] ;
}
}
}
}
}

In the code above, it is assumed that the element data of the iel-th element
is stored in elem[iel][] and its material data in mate[iel][]. The
two-dimensional array gsm[][], which is to store the global stiffness matrix, is
cleared to zero at first, and each time an element stiffness matrix is calculated, its
components are added to the appropriate positions in the global stiffness matrix.
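As a concrete illustration with hypothetical numbers: if nfpn = 3 and the i-th local node of an element corresponds to the global node number elem[iel][i] = 17, then idof = 17 × 3 = 51, so the three rows of the element stiffness matrix belonging to that local node are added to rows 51, 52, and 53 of gsm[][], and likewise for the columns through jdof.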

9.1.2 Mesh Quality

As we have seen in Chap. 4, the shape of an element affects the accuracy of the
elemental integration, and the error in the element stiffness matrix directly affects
the accuracy of the analysis results. For this reason, various parameters have been
used as criteria for judging the quality of the element shape during mesh genera-
tion to improve the initial mesh [3]. In this subsection, some programs to calculate
parameters representing the element geometry are discussed.
To begin with, the algebraic shape metric (AlgebraicShapeMetric) of an element
has been proposed as a measure of the element shape [9, 10]. The procedure to
calculate the AlgebraicShapeMetric of a hexahedral element of the first order shown
in Fig. 9.2 is given here. Assume that each node of a hexahedral element is numbered
as shown in the figure (the element node number). The coordinates of each node are
given in Table 9.2. Then, for the l-th node of an element, the matrix Al is defined as
A_l = \begin{bmatrix} x_i - x_l & x_j - x_l & x_k - x_l \\ y_i - y_l & y_j - y_l & y_k - y_l \\ z_i - z_l & z_j - z_l & z_k - z_l \end{bmatrix}   (9.1.55)

where i, j, and k are given for the l-th node according to Table 9.3, and (xi , yi , z i )
are the coordinate values of the i-th node.
Each column of Al above is a vector directing from the l-th node to the adjacent
node. Similarly, for the element of the cube shape (see Table 9.4 for the nodal
coordinates), the matrix Wl is defined as

Fig. 9.2 Node numbering in an element

Table 9.2 Coordinate values of nodes

l   (x_l, y_l, z_l)
0   (x_0, y_0, z_0)
1   (x_1, y_1, z_1)
2   (x_2, y_2, z_2)
3   (x_3, y_3, z_3)
4   (x_4, y_4, z_4)
5   (x_5, y_5, z_5)
6   (x_6, y_6, z_6)
7   (x_7, y_7, z_7)

Table 9.3 Nodal ordering for hexahedral element

l   i   j   k
0   1   3   4
1   2   0   5
2   3   1   6
3   0   2   7
4   7   5   0
5   4   6   1
6   5   7   2
7   6   4   3

W_l = \begin{bmatrix} x_i^c - x_l^c & x_j^c - x_l^c & x_k^c - x_l^c \\ y_i^c - y_l^c & y_j^c - y_l^c & y_k^c - y_l^c \\ z_i^c - z_l^c & z_j^c - z_l^c & z_k^c - z_l^c \end{bmatrix}   (9.1.56)

where i, j, and k are also given by Table 9.3, and (x_i^c, y_i^c, z_i^c) are the coordinate
values of the i-th node of the cubic element.
The matrix T_l is defined as the product of the matrix A_l and the inverse of the
matrix W_l.

Table 9.4 Coordinate values of nodes for a cubic element

l   (x_l^c, y_l^c, z_l^c)
0   (0, 0, 0)
1   (1, 0, 0)
2   (1, 1, 0)
3   (0, 1, 0)
4   (0, 0, 1)
5   (1, 0, 1)
6   (1, 1, 1)
7   (0, 1, 1)

T_l = A_l W_l^{-1}   (9.1.57)

Using the matrix Tl and its inverse, κl , called the condition number, is defined as
\kappa_l = \| T_l \| \, \| T_l^{-1} \|   (9.1.58)

where ∥A∥ is the Frobenius norm of the matrix A, defined to be the square root of
the sum of squares of all components of the matrix A as
\| A \| = \sqrt{\sum_{i,j} a_{ij}^2}   (9.1.59)

Finally, the AlgebraicShapeMetric f of a hexahedral element is defined with the
condition numbers \kappa_l at each node by

f = \frac{8}{\sum_{l=1}^{8} \left( \kappa_l / 3 \right)^2}   (9.1.60)

Note that the AlgebraicShapeMetric f of an element takes a value in the range
0.0 < f ≤ 1.0, with f = 1.0 for a perfect cube.
Let's take a look at ElementShapeMetric.c, a program code for calculating
the AlgebraicShapeMetric of an element. Main variables and arrays used in the code
are listed in Table 9.5.

/* ElementShapeMetric.c */
/*---------------------------------------------------*/
void inv_mat3(
double invA[][3],
double A[][3])
{

Table 9.5 Variables and arrays used in ElementShapeMetric.c


A[3][3] double, 2D array Matrix defined by Eq. (9.1.55)
invA[3][3] double, 2D array Inverse of the matrix A
W[3][3] double, 2D array Matrix defined by Eq. (9.1.56) for the element of cube shape
invW[3][3] double, 2D array Inverse of the matrix W
T[3][3] double, 2D array Matrix defined by Eq. (9.1.57)
invT[3][3] double, 2D array Inverse of the matrix T
kp[i] double, 1D array Condition number related to the i-th node
idx[i][j] int, 2D array Nodal ordering defined by Table 9.3
cube[8][3] double, 2D array Coordinate values of a cubic element

double det;
det = A[0][0]*A[1][1]*A[2][2] + A[0][1]*A[1][2]*A[2][0]
+ A[0][2]*A[1][0]*A[2][1] - A[0][0]*A[1][2]*A[2][1]
- A[0][1]*A[1][0]*A[2][2] - A[0][2]*A[1][1]*A[2][0] ;
invA[0][0] = (A[1][1]*A[2][2] - A[1][2]*A[2][1])/det;
invA[0][1] = (A[0][2]*A[2][1] - A[0][1]*A[2][2])/det;
invA[0][2] = (A[0][1]*A[1][2] - A[1][1]*A[0][2])/det;
invA[1][0] = (A[1][2]*A[2][0] - A[1][0]*A[2][2])/det;
invA[1][1] = (A[0][0]*A[2][2] - A[0][2]*A[2][0])/det;
invA[1][2] = (A[1][0]*A[0][2] - A[0][0]*A[1][2])/det;
invA[2][0] = (A[1][0]*A[2][1] - A[1][1]*A[2][0])/det;
invA[2][1] = (A[0][1]*A[2][0] - A[0][0]*A[2][1])/det;
invA[2][2] = (A[0][0]*A[1][1] - A[0][1]*A[1][0])/det;
}
/*---------------------------------------------------*/
void matMulti3(
double C[][3],
double A[][3],
double B[][3])
{
int i,j,k;
for(i=0;i<3;i++){
for(j=0;j<3;j++){
C[i][j] = 0.0 ;
for(k=0;k<3;k++) C[i][j] += A[i][k]*B[k][j] ;
}
}
}
/*---------------------------------------------------*/
double f_norm3(
double A[][3])
{
int i,j;
double dsum;

dsum = 0.0 ;
for(i=0;i<3;i++)
for(j=0;j<3;j++)
dsum += A[i][j]*A[i][j] ;
return dsum ;
}
/*---------------------------------------------------*/
double shape_metric(
int *elem,
double **node,
int nnpe,
int nfpn)
{
int i,j,k,ia,ib,ic,id,inode;
int idx[8][3]
= {{1,3,4},{2,0,5},{3,1,6},{0,2,7},
{7,5,0},{4,6,1},{5,7,2},{6,4,3}};
double kp[8],W[3][3],A[3][3],T[3][3],invW[3][3],
invT[3][3],cond;
double cube[8][3]
= {{0.0, 0.0, 0.0},{1.0, 0.0, 0.0},
{1.0, 1.0, 0.0},{0.0, 1.0, 0.0},
{0.0, 0.0, 1.0},{1.0, 0.0, 1.0},
{1.0, 1.0, 1.0},{0.0, 1.0, 1.0}} ;
for(inode=0;inode<nnpe;inode++){
ia = elem[inode] ;
ic = inode ;
for(j=0;j<nfpn;j++){
ib = elem[idx[inode][j]] ;
id = idx[inode][j] ;
for(i=0;i<nfpn;i++)
A[i][j] = node[ib][i] - node[ia][i] ;
for(i=0;i<nfpn;i++)
W[i][j] = cube[id][i] - cube[ic][i] ;
}
inv_mat3(invW,W) ;
matMulti3(T,A,invW) ;
inv_mat3(invT,T) ;
kp[inode] = f_norm3(T)*f_norm3(invT) ;
}
for(inode=0,cond=0.0;inode<nnpe;inode++)
cond += kp[inode]/9 ;
return 8.0/cond ;
}
/*---------------------------------------------------*/

In the code above, inv_mat3() is a function that calculates the inverse matrix
invA of a 3-by-3 matrix A given. See Sect. 9.1.1 for details.

matMulti3() is a function that calculates the product C = AB. Let the
components of the matrices A, B, and C be denoted by a_{ij}, b_{ij}, and c_{ij}, respectively, and
c_{ij} is obtained by

c_{ij} = \sum_{k=0}^{2} a_{ik} \cdot b_{kj}   (9.1.61)

f_norm3() is a function that returns the square of the Frobenius norm of a 3-by-3 matrix
given as an argument, i.e., the sum of squares of all its components; the square root in
Eq. (9.1.59) is not taken, so that kp[inode] = f_norm3(T)*f_norm3(invT) directly stores
\kappa_l^2, which is what Eq. (9.1.60) requires.
shape_metric() is a function that calculates the AlgebraicShapeMetric. The
array idx[8][3] contains the same data as in Table 9.3. The array cube[8][3]
contains the nodal coordinates of the element of the cube shape as shown in Table
9.4.
The structure of the program is summarized as follows (a short usage example is given after the list):
Loop on nodes
  Define the matrix A_l [Eq. (9.1.55)].
  Define the matrix W_l [Eq. (9.1.56)].
  inv_mat3(): calculate the inverse matrix of W_l.
  matMulti3(): define the matrix T_l [Eq. (9.1.57)].
  inv_mat3(): calculate the inverse matrix of T_l.
  Calculate kp, the square of the condition number \kappa_l, at each node [Eqs. (9.1.58) and (9.1.59)].
Accumulate (\kappa_l / 3)^2 over all the nodes.
Return the AlgebraicShapeMetric f of the element [Eq. (9.1.60)].
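The following is a minimal driver, not part of the book's listing, that calls shape_metric() for the element of Table 9.6; the prototype is the one of ElementShapeMetric.c above, which is assumed to be compiled and linked together, and the value of zc is a hypothetical choice (zc = 1.0 recovers the perfect cube, for which f = 1.0).

/* Driver sketch: AlgebraicShapeMetric of the element of Table 9.6 */
#include <stdio.h>

double shape_metric(int *elem, double **node, int nnpe, int nfpn);

int main(void)
{
  double coords[8][3] = {
    {0.0, 0.0, 0.0}, {1.0, 0.0, 0.0}, {1.0, 1.0, 0.0}, {0.0, 1.0, 0.0},
    {0.0, 0.0, 1.0}, {1.0, 0.0, 1.0}, {1.0, 1.0, 1.0}, {0.0, 1.0, 1.0} };
  double *node[8];
  int elem[8] = {0, 1, 2, 3, 4, 5, 6, 7};
  double zc = 0.5;                 /* z-coordinate of node 4 */
  int i;
  coords[4][2] = zc;               /* distort the cube as in Table 9.6 */
  for(i = 0; i < 8; i++) node[i] = coords[i];
  printf("AlgebraicShapeMetric f = %f\n", shape_metric(elem, node, 8, 3));
  return 0;
}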
Let’s calculate the AlgebraicShapeMetric of an element. Assume the shape of the
element is cubic with only the z-coordinate of one node moved, as shown in Table
9.6.
According to Eq. (9.1.55), the matrices Al of all the nodes are defined as follows:

Table 9.6 Coordinate values of nodes for a distorted cubic element

l   (x_l, y_l, z_l)
0   (0, 0, 0)
1   (1, 0, 0)
2   (1, 1, 0)
3   (0, 1, 0)
4   (0, 0, z)
5   (1, 0, 1)
6   (1, 1, 1)
7   (0, 1, 1)
A_0 = \begin{bmatrix} x_1-x_0 & x_3-x_0 & x_4-x_0 \\ y_1-y_0 & y_3-y_0 & y_4-y_0 \\ z_1-z_0 & z_3-z_0 & z_4-z_0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & z \end{bmatrix}   (9.1.62)
A_1 = \begin{bmatrix} x_2-x_1 & x_0-x_1 & x_5-x_1 \\ y_2-y_1 & y_0-y_1 & y_5-y_1 \\ z_2-z_1 & z_0-z_1 & z_5-z_1 \end{bmatrix} = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.63)
A_2 = \begin{bmatrix} x_3-x_2 & x_1-x_2 & x_6-x_2 \\ y_3-y_2 & y_1-y_2 & y_6-y_2 \\ z_3-z_2 & z_1-z_2 & z_6-z_2 \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.64)
A_3 = \begin{bmatrix} x_0-x_3 & x_2-x_3 & x_7-x_3 \\ y_0-y_3 & y_2-y_3 & y_7-y_3 \\ z_0-z_3 & z_2-z_3 & z_7-z_3 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.65)
A_4 = \begin{bmatrix} x_7-x_4 & x_5-x_4 & x_0-x_4 \\ y_7-y_4 & y_5-y_4 & y_0-y_4 \\ z_7-z_4 & z_5-z_4 & z_0-z_4 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1-z & 1-z & -z \end{bmatrix}   (9.1.66)
A_5 = \begin{bmatrix} x_4-x_5 & x_6-x_5 & x_1-x_5 \\ y_4-y_5 & y_6-y_5 & y_1-y_5 \\ z_4-z_5 & z_6-z_5 & z_1-z_5 \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ z-1 & 0 & -1 \end{bmatrix}   (9.1.67)
A_6 = \begin{bmatrix} x_5-x_6 & x_7-x_6 & x_2-x_6 \\ y_5-y_6 & y_7-y_6 & y_2-y_6 \\ z_5-z_6 & z_7-z_6 & z_2-z_6 \end{bmatrix} = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}   (9.1.68)
A_7 = \begin{bmatrix} x_6-x_7 & x_4-x_7 & x_3-x_7 \\ y_6-y_7 & y_4-y_7 & y_3-y_7 \\ z_6-z_7 & z_4-z_7 & z_3-z_7 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & z-1 & -1 \end{bmatrix}   (9.1.69)

Similarly, according to Eq. (9.1.56), the matrices Wl of all the nodes are defined
as follows:
W_0 = \begin{bmatrix} x_1^c-x_0^c & x_3^c-x_0^c & x_4^c-x_0^c \\ y_1^c-y_0^c & y_3^c-y_0^c & y_4^c-y_0^c \\ z_1^c-z_0^c & z_3^c-z_0^c & z_4^c-z_0^c \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.70)
W_1 = \begin{bmatrix} x_2^c-x_1^c & x_0^c-x_1^c & x_5^c-x_1^c \\ y_2^c-y_1^c & y_0^c-y_1^c & y_5^c-y_1^c \\ z_2^c-z_1^c & z_0^c-z_1^c & z_5^c-z_1^c \end{bmatrix} = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.71)
W_2 = \begin{bmatrix} x_3^c-x_2^c & x_1^c-x_2^c & x_6^c-x_2^c \\ y_3^c-y_2^c & y_1^c-y_2^c & y_6^c-y_2^c \\ z_3^c-z_2^c & z_1^c-z_2^c & z_6^c-z_2^c \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.72)
W_3 = \begin{bmatrix} x_0^c-x_3^c & x_2^c-x_3^c & x_7^c-x_3^c \\ y_0^c-y_3^c & y_2^c-y_3^c & y_7^c-y_3^c \\ z_0^c-z_3^c & z_2^c-z_3^c & z_7^c-z_3^c \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.73)
W_4 = \begin{bmatrix} x_7^c-x_4^c & x_5^c-x_4^c & x_0^c-x_4^c \\ y_7^c-y_4^c & y_5^c-y_4^c & y_0^c-y_4^c \\ z_7^c-z_4^c & z_5^c-z_4^c & z_0^c-z_4^c \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}   (9.1.74)
W_5 = \begin{bmatrix} x_4^c-x_5^c & x_6^c-x_5^c & x_1^c-x_5^c \\ y_4^c-y_5^c & y_6^c-y_5^c & y_1^c-y_5^c \\ z_4^c-z_5^c & z_6^c-z_5^c & z_1^c-z_5^c \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}   (9.1.75)
W_6 = \begin{bmatrix} x_5^c-x_6^c & x_7^c-x_6^c & x_2^c-x_6^c \\ y_5^c-y_6^c & y_7^c-y_6^c & y_2^c-y_6^c \\ z_5^c-z_6^c & z_7^c-z_6^c & z_2^c-z_6^c \end{bmatrix} = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}   (9.1.76)
W_7 = \begin{bmatrix} x_6^c-x_7^c & x_4^c-x_7^c & x_3^c-x_7^c \\ y_6^c-y_7^c & y_4^c-y_7^c & y_3^c-y_7^c \\ z_6^c-z_7^c & z_4^c-z_7^c & z_3^c-z_7^c \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}   (9.1.77)

Then, the inverse matrices of the matrices Wl are calculated, respectively, as


follows:
W_0 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad W_0^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.78)
W_1 = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad W_1^{-1} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.79)
W_2 = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad W_2^{-1} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.80)
W_3 = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad W_3^{-1} = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.81)
W_4 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}, \quad W_4^{-1} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}   (9.1.82)
W_5 = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}, \quad W_5^{-1} = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}   (9.1.83)
W_6 = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}, \quad W_6^{-1} = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}   (9.1.84)
W_7 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}, \quad W_7^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}   (9.1.85)

According to Eq. (9.1.57), the matrices Tl are calculated, respectively, as follows:


T_0 = A_0 W_0^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & z \end{bmatrix}   (9.1.86)
T_1 = A_1 W_1^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.87)
T_2 = A_2 W_2^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.88)
T_3 = A_3 W_3^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.89)
T_4 = A_4 W_4^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1-z & 1-z & z \end{bmatrix}   (9.1.90)
T_5 = A_5 W_5^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1-z & 0 & 1 \end{bmatrix}   (9.1.91)
T_6 = A_6 W_6^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.92)
T_7 = A_7 W_7^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1-z & 1 \end{bmatrix}   (9.1.93)

Then, the inverse matrices of the matrices Tl are calculated, respectively, as


follows:
T_0 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & z \end{bmatrix}, \quad T_0^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{z} \end{bmatrix}   (9.1.94)
T_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad T_1^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.95)
T_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad T_2^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.96)
T_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad T_3^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.97)
T_4 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1-z & 1-z & z \end{bmatrix}, \quad T_4^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \frac{z-1}{z} & \frac{z-1}{z} & \frac{1}{z} \end{bmatrix}   (9.1.98)
T_5 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1-z & 0 & 1 \end{bmatrix}, \quad T_5^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ z-1 & 0 & 1 \end{bmatrix}   (9.1.99)
T_6 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad T_6^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (9.1.100)
T_7 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1-z & 1 \end{bmatrix}, \quad T_7^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & z-1 & 1 \end{bmatrix}   (9.1.101)

According to Eq. (9.1.58), the condition numbers κl for all the nodes are calculated,
respectively, as follows:

\kappa_0 = \| T_0 \| \cdot \| T_0^{-1} \| = \sqrt{2 + z^2} \cdot \sqrt{2 + \frac{1}{z^2}}   (9.1.102)
\kappa_1 = \| T_1 \| \cdot \| T_1^{-1} \| = \sqrt{3} \cdot \sqrt{3}   (9.1.103)
\kappa_2 = \| T_2 \| \cdot \| T_2^{-1} \| = \sqrt{3} \cdot \sqrt{3}   (9.1.104)
\kappa_3 = \| T_3 \| \cdot \| T_3^{-1} \| = \sqrt{3} \cdot \sqrt{3}   (9.1.105)
\kappa_4 = \| T_4 \| \cdot \| T_4^{-1} \| = \sqrt{2 + 2(1-z)^2 + z^2} \cdot \sqrt{2 + 2\left(\frac{z-1}{z}\right)^2 + \frac{1}{z^2}}   (9.1.106)
\kappa_5 = \| T_5 \| \cdot \| T_5^{-1} \| = \sqrt{3 + (1-z)^2} \cdot \sqrt{3 + (z-1)^2}   (9.1.107)
\kappa_6 = \| T_6 \| \cdot \| T_6^{-1} \| = \sqrt{3} \cdot \sqrt{3}   (9.1.108)
\kappa_7 = \| T_7 \| \cdot \| T_7^{-1} \| = \sqrt{3 + (1-z)^2} \cdot \sqrt{3 + (z-1)^2}   (9.1.109)

As seen from Eqs. (9.1.102)–(9.1.109), condition numbers κ1 , κ2 , κ3 , and κ6 do


not change in this case because the fourth node whose z-coordinate is to be changed
does not become the target node or its adjacent node for these values. Note that,
from Eqs. (9.1.102), (9.1.106), (9.1.107) and (9.1.109), it can be easily shown that
the minimum value of the condition numbers κ0 , κ4 , κ5 and κ7 is 3.
Figure 9.3 shows the values of the condition numbers κ0 , κ4 , κ5 , and κ7 and those
of AlgebraicShapeMetric f of the elements. The horizontal axis is the z-coordinate
of the fourth node, the left vertical scale the value of κl , and the right vertical scale
that of AlgebraicShapeMetric f. In the case of a perfect cubic shape, each condition
number \kappa_l has the minimum value of 3.0, and the value of AlgebraicShapeMetric f
the maximum value of 1.0.

Fig. 9.3 Condition numbers and algebraic shape metric
Next, let's take a look at ElementShape.c, a program code that calculates
the maximum and minimum values of the edge lengths, those of the edge angles,
and those of the face angles as another set of measures of the element shape of a
hexahedral element. The main variables and arrays are listed in Table 9.7.

/* ElementShape.c */
/*---------------------------------------------------*/
void check_shape(
double *s_data,
int *elem,
double **node,
int nfpn)
{
int i,j,k,i0,i1,i2,n,ne,nnpe=8;
double dl[12],da[48],de[24],e[12][3],d1,d2,
dx0[3],dx1[3],dx2[3],
nv[8][3][3],v1[3],v2[3],v3[3],v4[3],ndata[8][3],
dl_min,dl_max,da_min,da_max,de_max,de_min;
int idx[8][3] =
{{0,3,8},{1,0,9},{2,1,10},{3,2,11},
{7,4,8},{4,5,9},{5,6,10},{6,7,11}};
int sg[8][3] = {{ 1,-1, 1},{ 1,-1, 1},{ 1,-1, 1},{ 1,-1, 1},
{-1, 1,-1},{-1, 1,-1},{-1, 1,-1},{-1, 1,-1} };
int id[12][6]
= { {0,2,1, 1,0,1}, {1,2,1, 2,0,1},

Table 9.7 Variables and arrays used in ElementShape.c


dl[i] double, 1D array The length of the i-th edge
e[i][3] double, 2D array The unit vector along the i-th edge
dx0[3] double, 1D array The unit vector along the first edge of the node, using the edge
ordering given in idx [][ ]
dx1[3] double, 1D array The unit vector along the second edge of the node, using the
edge ordering given in idx [][ ]
dx2[3] double, 1D array The unit vector along the third edge of the node, using the edge
ordering given in idx [][ ]
nv[i][j][3] double, 3D array The outward unit normal vector of the j-th face around the i-th
node
v1[3], v2[3], v3[3], v4[3]   double, 1D array   For each terminal node of an edge, two outward unit normal vectors are defined, each for one of the two faces that share the edge; v1 and v3 are normal vectors of one face, and v2 and v4 of the other face
ndata[8][3] double, 2D array Coordinate values of eight nodes in an element
idx[i][j] int, 2D array The number of the j-th edge for the i-th node
sg[i][j] int, 2D array The direction of the j-th vector of the i-th node
id[i][j] int, 2D array For the two faces that share the i-th edge, this array indicates
which normal vectors should be checked

{2,2,1, 3,0,1},{3,2,1, 0,0,1},


{4,1,0, 5,1,2}, {5,1,0, 6,1,2},
{6,1,0, 7,1,2}, {7,1,0, 4,1,2},
{0,0,2, 4,2,0}, {1,0,2, 5,2,0},
{2,0,2, 6,2,0},{3,0,2, 7,2,0}};
for(i=0;i<nnpe;i++){
k = elem[i] ;
for(j=0;j<nfpn;j++) ndata[i][j] = node[k][j] ;
}
/*----------- length ----------------*/
for(i=0;i<=3;i++){
for(j=0;j<nfpn;j++)
e[i][j] = ndata[(i+1)%4][j] - ndata[i][j] ;
for(j=0,d1=0.0;j<nfpn;j++) d1 += e[i][j]*e[i][j] ;
dl[i] = sqrt(d1) ;
for(j=0;j<nfpn;j++) e[i][j] /= dl[i] ;
}
for(i=4;i<=7;i++){
for(j=0;j<nfpn;j++)
e[i][j] = ndata[(i+1)%4+4][j] - ndata[i][j] ;
for(j=0,d1=0.0;j<nfpn;j++) d1 += e[i][j]*e[i][j] ;
dl[i] = sqrt(d1) ;
for(j=0;j<nfpn;j++) e[i][j] /= dl[i] ;

}
for(i=0;i<=3;i++){
for(j=0;j<nfpn;j++)
e[i+8][j] = ndata[i+4][j] - ndata[i][j] ;
for(j=0,d1=0.0;j<nfpn;j++) d1 += e[i+8][j]*e[i+8][j] ;
dl[i+8] = sqrt(d1) ;
for(j=0;j<nfpn;j++) e[i+8][j] /= dl[i+8] ;
}
/*------------ angle1 ----------------*/
for(n=0;n<nnpe;n++){
for(i=0;i<nfpn;i++) dx0[i] = sg[n][0]*e[idx[n][0]][i] ;
for(i=0;i<nfpn;i++) dx1[i] = sg[n][1]*e[idx[n][1]][i] ;
for(i=0;i<nfpn;i++) dx2[i] = sg[n][2]*e[idx[n][2]][i] ;
de[n*3+0] = dx0[0]*dx1[0]+dx0[1]*dx1[1]+dx0[2]*dx1[2] ;
de[n*3+1] = dx1[0]*dx2[0]+dx1[1]*dx2[1]+dx1[2]*dx2[2] ;
de[n*3+2] = dx2[0]*dx0[0]+dx2[1]*dx0[1]+dx2[2]*dx0[2] ;
nv[n][0][0] = dx2[1]*dx1[2] - dx2[2]*dx1[1] ;
nv[n][0][1] = dx2[2]*dx1[0] - dx2[0]*dx1[2] ;
nv[n][0][2] = dx2[0]*dx1[1] - dx2[1]*dx1[0] ;
nv[n][1][0] = dx1[1]*dx0[2] - dx1[2]*dx0[1] ;
nv[n][1][1] = dx1[2]*dx0[0] - dx1[0]*dx0[2] ;
nv[n][1][2] = dx1[0]*dx0[1] - dx1[1]*dx0[0] ;
nv[n][2][0] = dx0[1]*dx2[2] - dx0[2]*dx2[1] ;
nv[n][2][1] = dx0[2]*dx2[0] - dx0[0]*dx2[2] ;
nv[n][2][2] = dx0[0]*dx2[1] - dx0[1]*dx2[0] ;
for(j=0;j<3;j++){
for(i=0,d1=0.0;i<nfpn;i++) d1 += nv[n][j][i]*nv[n][j][i] ;
d2 = 1.0/sqrt(d1) ;
for(i=0;i<nfpn;i++) nv[n][j][i] *= d2 ;
}
}
/*------------ angle2 ----------------*/
for(ne=0;ne<12;ne++){
for(i=0;i<nfpn;i++) v1[i] = nv[id[ne][0]][id[ne][1]][i] ;
for(i=0;i<nfpn;i++) v2[i] = nv[id[ne][0]][id[ne][2]][i] ;
for(i=0;i<nfpn;i++) v3[i] = nv[id[ne][3]][id[ne][4]][i] ;
for(i=0;i<nfpn;i++) v4[i] = nv[id[ne][3]][id[ne][5]][i] ;
da[ne*4+0] = v1[0]*v2[0]+v1[1]*v2[1]+v1[2]*v2[2] ;
da[ne*4+1] = v3[0]*v4[0]+v3[1]*v4[1]+v3[2]*v4[2] ;
da[ne*4+2] = v1[0]*v4[0]+v1[1]*v4[1]+v1[2]*v4[2] ;
da[ne*4+3] = v2[0]*v3[0]+v2[1]*v3[1]+v2[2]*v3[2] ;
}
/*------------minmax-----------------------*/
dl_max = -1.0 ;
dl_min = 1.0e30 ;
for(i=0;i<12;i++){
if(dl[i] > dl_max) dl_max = dl[i] ;
if(dl[i] < dl_min) dl_min = dl[i] ;

}
de_max = -1.0e30 ;
de_min = 1.0e30 ;
for(i=0;i<24;i++){
if(de[i] > de_max) de_max = de[i] ;
if(de[i] < de_min) de_min = de[i] ;
}
da_max = -1.0e30 ;
da_min = 1.0e30 ;
for(i=0;i<48;i++){
if(da[i] > da_max) da_max = da[i] ;
if(da[i] < da_min) da_min = da[i] ;
}
s_data[0] = dl_min ;
s_data[1] = dl_max ;
s_data[2] = acos(da_min)*180/3.1415926 ;
s_data[3] = acos(da_max)*180/3.1415926 ;
s_data[4] = acos(de_min)*180/3.1415926 ;
s_data[5] = acos(de_max)*180/3.1415926 ;
}
/*---------------------------------------------------*/

The program code above is structured as follows:


Calculation of the lengths of the edges (length part)
Calculation of the angles between edges (angle1 part)
Calculation of the angles between faces (angle2 part)
Calculation of the maximum and minimum values (minmax part).
First, in the length part, the length of each edge and the unit vector e[i] along
each edge are calculated. The directions of the unit vectors are shown in Fig. 9.4.
For example, e[4] is the unit vector from the fourth node to the fifth node.
Fig. 9.4 Unit vector along each edge

Fig. 9.5 Unit vectors at each node

Fig. 9.6 Outward normal vectors of three faces that share a node

Next, in the angle1 part, the angle between edges is calculated. As shown in
Fig. 9.5, unit vectors dx0[i], dx1[i], and dx2[i] are defined at a node, each
of which starts from the node and directs along the edge. The array idx[i][j]
contains the correspondence between the unit vector at each node and that along each
edge, and the array sg[i][j] the orientation of the unit vector.
For example, at the node 4, idx[4]={7,4,8} and sg[4]={-1,1,-1}
define the unit vectors at the node as dx0[4]=-e[7], dx1[4]=e[4], and
dx2[4]=-e[8]. Once the unit vectors at each node are defined, the angles between
the three unit vectors at the node are calculated based on the cosine of the angle
between two vectors. Thus, a total of 24 angles between edges are calculated.
In addition, using the three unit vectors at each node, the outward unit normal
vectors nv [ i ][ j] of the three faces that share the node (Fig. 9.6) are obtained
from the outer product of the unit vectors in the face.
Next, in the angle2 part, the angle between the faces of the hexahedron is calcu-
lated. Each face of a hexahedral element is not necessarily flat, i.e., it is not guaranteed
that the four nodes of a face lie on the same plane. For this reason, it is difficult to
show the angle between the faces with a single value; therefore, four angle values
are calculated between two faces of an element instead.
Figure 9.7 shows how to express the angle between two faces that share an edge
(red) in terms of the four angles formed by triangles composed of nodes on each face
(it is obvious that the three points of a triangle are on the same plane). Since each
triangle has two sides as element edges, its normal vector is one of the nv [ i ][
j] calculated in the angle1 part. The array id [ i ][ j] contains for each edge the
corresponding normal vector of the triangle shown in Fig. 9.7.
For example, for the fourth edge, id[4]={4,1,0, 5,1,2} indicates that
the normal vectors nv[4][1], nv[4][0] at the fourth node and nv[5][1],
nv[5][2] at the fifth node are those used to calculate the angle between the two
faces that share the fourth edge, as shown in Fig. 9.6. Thus, a total of 48 angles are
calculated for all the 12 edges.

Fig. 9.7 Angle between two faces
Finally, in the minmax part, the maximum and minimum values are calculated for
each parameter. Then, each value related to some angle is converted from the cosine
value to an angle value to make it easier to understand intuitively.
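A minimal driver, not part of the book's listing, that applies check_shape() to a unit cube; the prototype is the one of ElementShape.c above, which is assumed to be compiled and linked together. All edge lengths should come out as 1.0 and all angles as 90 degrees.

/* Driver sketch: shape parameters of a unit cube */
#include <stdio.h>

void check_shape(double *s_data, int *elem, double **node, int nfpn);

int main(void)
{
  double coords[8][3] = {
    {0.0, 0.0, 0.0}, {1.0, 0.0, 0.0}, {1.0, 1.0, 0.0}, {0.0, 1.0, 0.0},
    {0.0, 0.0, 1.0}, {1.0, 0.0, 1.0}, {1.0, 1.0, 1.0}, {0.0, 1.0, 1.0} };
  double *node[8], s_data[6];
  int elem[8] = {0, 1, 2, 3, 4, 5, 6, 7};
  int i;
  for(i = 0; i < 8; i++) node[i] = coords[i];
  check_shape(s_data, elem, node, 3);
  printf("edge length       : %f %f\n", s_data[0], s_data[1]);
  printf("face angles (deg) : %f %f\n", s_data[2], s_data[3]);
  printf("edge angles (deg) : %f %f\n", s_data[4], s_data[5]);
  return 0;
}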

9.1.3 B-Spline and NURBS

A method for applying deep learning to the contact search between surfaces defined
by NURBS is discussed in Chap. 6. NURBS, which has been used for defining shapes
in CAD, is also used as the basis function in the isogeometric analysis [1, 6]. In this
subsection, a program code for computing NURBS basis functions is given.
NURBS basis functions are created from B-spline basis functions. Let’s take a
look at bspline.c, a program that calculates the B-spline basis functions. The
main variables and arrays used are listed in Table 9.8.

/* bspline.c */
#define NMAX_KV 100
#define KV_EPS 1.0e-5
#define NURBS_EPS 1.0e-20

Table 9.8 Variables and arrays used in bspline.c


N[i] double, 1D array The i-th B-spline basis function Ni (ξ )
M[j] double, 1D array The j-th B-spline basis function M j (η)
L[k] double, 1D array The k-th B-spline basis function L k (ζ )
dN[i] double, 1D array The first derivative of the i-th B-spline basis function Ni (ξ )
dM[j] double, 1D array The first derivative of the j-th B-spline basis function M j (η)
dL[k] double, 1D array The first derivative of the k-th B-spline basis function L k (ζ )
n int Number of basis functions Ni (ξ )
m int Number of basis functions M j (η)
l int Number of basis functions L k (ζ )
KV[] double, 1D array Knot vector
nkv int Dimension of the knot vector
p int The polynomial order of the B-spline basis functions
ni int KV[ni] < ξ < KV[ni + 1]
xi double ξ
et double η
ze double ζ
B[i][j][k]        double, 3D array   3D B-spline functions B_{i,j,k}(\xi, \eta, \zeta)
dB_xi[i][j][k]    double, 3D array   \partial B_{i,j,k}(\xi, \eta, \zeta) / \partial \xi
dB_et[i][j][k]    double, 3D array   \partial B_{i,j,k}(\xi, \eta, \zeta) / \partial \eta
dB_ze[i][j][k]    double, 3D array   \partial B_{i,j,k}(\xi, \eta, \zeta) / \partial \zeta

/*---------------------------------------------------*/
void Bspline00(
double *N,
double *dN,
double xi,
int ni,
int p,
int nkv,
double *KV)
{
int i,j,k;
double d,e,d_d,d_e,temp[NMAX_KV],dtemp[NMAX_KV],dw;
for(i=ni-p;i<=ni+p;i++){
temp[i] = 0.0 ;
dtemp[i] = 0.0 ;
}
for(i=ni-p;i<=ni;i++){
if((xi>=KV[i]) && (xi<KV[i+1])){
temp[i] = 1.0;

}else{
temp[i] = 0.0;
}
dtemp[i] = 0.0 ;
}
for(k=1;k<=p;k++){
for(i=ni-k;i<=ni;i++){
if(fabs(temp[i]) > NURBS_EPS){
dw = KV[i+k] - KV[i] ;
if(fabs(dw) > KV_EPS){
d = ((xi - KV[i])*temp[i])/dw ;
d_d = k*temp[i]/dw ;
}else{
d = 0.0 ;
d_d = 0.0 ;
}
}else{
d = 0.0;
d_d = 0.0 ;
}
if(fabs(temp[i+1]) > NURBS_EPS){
dw = KV[i+k+1] - KV[i+1] ;
if(fabs(dw) > KV_EPS){
e = ((KV[i+k+1] - xi)*temp[i+1])/dw ;
d_e = k*temp[i+1]/dw ;
}else{
e = 0.0 ;
d_e = 0.0 ;
}
}else{
e = 0.0;
d_e = 0.0 ;
}
temp[i] = d + e;
dtemp[i] = d_d - d_e;
}
}
for(i=0;i<=p;i++){
N[i] = temp[ni-p+i] ;
dN[i] = dtemp[ni-p+i] ;
}
}
/*---------------------------------------------------*/
void Bspline1D(
double *N,
double *dN,
int n,
double xi,
int p,
int nkv,
double *KV)

{
int i,j,k,nb;
double d1,d2,d3;
for(i=0;i<n;i++){
N[i] = 0.0 ;
dN[i] = 0.0 ;
}
for(i=0;i<nkv;i++)
if((xi >= KV[i]) && (xi < KV[i+1])) break ;
Bspline00(N+(i-p),dN+(i-p),xi,i,p,nkv,KV) ;
}
/*---------------------------------------------------*/
void Bspline3D(
double ***B,
double ***dB_xi,
double ***dB_et,
double ***dB_ze,
double *N,
double *dN,
int n,
double *M,
double *dM,
int m,
double *L,
double *dL,
int l)
{
int i,j,k;
double d1,d2,d3,dd1,dd2,dd3;
for(i=0;i<n;i++){
d1 = N[i] ;
dd1 = dN[i] ;
for(j=0;j<m;j++){
d2 = M[j] ;
dd2 = dM[j] ;
for(k=0;k<l;k++){
d3 = L[k] ;
dd3 = dL[k] ;
B[i][j][k] = d1*d2*d3;
dB_xi[i][j][k] = dd1*d2*d3;
dB_et[i][j][k] = d1*dd2*d3;
dB_ze[i][j][k] = d1*d2*dd3;
}
}
}
}
/*---------------------------------------------------*/

For a knot vector \Xi = \{\xi_1, \xi_2, \xi_3, \cdots, \xi_{n+p}, \xi_{n+p+1}\}, a monotonically non-decreasing
sequence of real numbers, the n one-dimensional p-th order B-spline
basis functions N_{i,p}(\xi) are defined as [12, 15]


N_{i,0}(\xi) = \begin{cases} 1 & (\xi_i \le \xi < \xi_{i+1}) \\ 0 & (\text{otherwise}) \end{cases}   (9.1.110)

N_{i,p}(\xi) = \frac{\xi - \xi_i}{\xi_{i+p} - \xi_i} N_{i,p-1}(\xi) + \frac{\xi_{i+p+1} - \xi}{\xi_{i+p+1} - \xi_{i+1}} N_{i+1,p-1}(\xi)   (9.1.111)

Here, the rule 0/0 = 0 is applied to the fractional part of Eq. (9.1.111). (See
Chapter 6 for an example of calculation using Eqs. (9.1.110) and (9.1.111).)
Note that an open knot vector with \xi_1 = \xi_2 = \cdots = \xi_p = \xi_{p+1} and \xi_{n+1} = \xi_{n+2} =
\cdots = \xi_{n+p} = \xi_{n+p+1} is usually employed for CAD and the isogeometric analysis.
Differentiating Eq. (9.1.111), we have Eq. (9.1.112), showing that the derivative of
the B-spline basis function of the p-th order can be obtained from the basis functions
of the (p−1)-th order and their derivatives. In other words, as well as the B-spline
basis function of the p-th order, its derivative can be calculated recursively by Eq.
(9.1.112).

\frac{dN_{i,p}(\xi)}{d\xi} = \frac{1}{\xi_{i+p} - \xi_i} N_{i,p-1}(\xi) + \frac{\xi - \xi_i}{\xi_{i+p} - \xi_i} \frac{dN_{i,p-1}(\xi)}{d\xi} + \frac{-1}{\xi_{i+p+1} - \xi_{i+1}} N_{i+1,p-1}(\xi) + \frac{\xi_{i+p+1} - \xi}{\xi_{i+p+1} - \xi_{i+1}} \frac{dN_{i+1,p-1}(\xi)}{d\xi}   (9.1.112)

For the first-order and the higher-order derivatives, we have, respectively, the
following formulas [12].

\frac{dN_{i,p}(\xi)}{d\xi} = \frac{p}{\xi_{i+p} - \xi_i} N_{i,p-1}(\xi) - \frac{p}{\xi_{i+p+1} - \xi_{i+1}} N_{i+1,p-1}(\xi)   (9.1.113)

\frac{d^k N_{i,p}(\xi)}{d\xi^k} = \frac{p}{\xi_{i+p} - \xi_i} \frac{d^{k-1} N_{i,p-1}(\xi)}{d\xi^{k-1}} - \frac{p}{\xi_{i+p+1} - \xi_{i+1}} \frac{d^{k-1} N_{i+1,p-1}(\xi)}{d\xi^{k-1}}   (9.1.114)

Similarly, expressing the higher-order derivative as a linear combination of basis


functions, we can derive

\frac{d^k N_{i,p}(\xi)}{d\xi^k} = \frac{p!}{(p-k)!} \sum_{j=0}^{k} a_{k,j} N_{i+j,p-k}(\xi)   (9.1.115)

where
a_{0,0} = 1
a_{k,0} = \frac{a_{k-1,0}}{\xi_{i+p-k+1} - \xi_i}
a_{k,j} = \frac{a_{k-1,j} - a_{k-1,j-1}}{\xi_{i+p+j-k+1} - \xi_{i+j}} \quad (0 < j < k)   (9.1.116)
a_{k,k} = \frac{-a_{k-1,k-1}}{\xi_{i+p+1} - \xi_{i+k}}

In the code bspline.c above, the function Bspline1D calculates the n B-spline
basis functions of the p-th order N_{1,p}(\xi_0), N_{2,p}(\xi_0), \cdots, N_{n,p}(\xi_0) and their
first-order derivatives \frac{dN_{1,p}(\xi_0)}{d\xi}, \frac{dN_{2,p}(\xi_0)}{d\xi}, \cdots, \frac{dN_{n,p}(\xi_0)}{d\xi} for some given \xi_0, as
illustrated in the short example below. The function Bspline00 is responsible for the
main body of the recurrence formula.
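A minimal sketch, not part of the book's listing, of calling Bspline1D for quadratic (p = 2) B-splines on the open knot vector {0, 0, 0, 1, 2, 2, 2} (n = 4 basis functions) at \xi = 0.5; the nonzero values land in N[0]–N[2] and should sum to 1 by the partition of unity. It assumes bspline.c is compiled and linked together.

/* Driver sketch for Bspline1D of bspline.c */
#include <stdio.h>

void Bspline1D(double *N, double *dN, int n, double xi, int p, int nkv, double *KV);

int main(void)
{
  double KV[7] = {0.0, 0.0, 0.0, 1.0, 2.0, 2.0, 2.0};
  double N[4], dN[4];
  int i, n = 4, p = 2, nkv = 7;
  Bspline1D(N, dN, n, 0.5, p, nkv, KV);
  for(i = 0; i < n; i++)
    printf("N[%d] = %f   dN[%d] = %f\n", i, N[i], i, dN[i]);
  return 0;
}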
The two- and three-dimensional B-spline basis functions are defined as the
product of the one-dimensional B-spline basis functions, which is written in the
three-dimensional case as
N_{i,j,k}^{p,q,r}(\xi, \eta, \zeta) = N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot L_{k,r}(\zeta)   (9.1.117)

And its partial derivatives are written as follows:


\frac{\partial N_{i,j,k}^{p,q,r}(\xi, \eta, \zeta)}{\partial \xi} = \frac{\partial N_{i,p}(\xi)}{\partial \xi} \cdot M_{j,q}(\eta) \cdot L_{k,r}(\zeta)   (9.1.118)

\frac{\partial N_{i,j,k}^{p,q,r}(\xi, \eta, \zeta)}{\partial \eta} = N_{i,p}(\xi) \cdot \frac{\partial M_{j,q}(\eta)}{\partial \eta} \cdot L_{k,r}(\zeta)   (9.1.119)

\frac{\partial N_{i,j,k}^{p,q,r}(\xi, \eta, \zeta)}{\partial \zeta} = N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot \frac{\partial L_{k,r}(\zeta)}{\partial \zeta}   (9.1.120)

The three-dimensional B-spline basis functions and their partial derivatives are
computed by calling Bspline3D after computing the three one-dimensional B-
spline basis functions Ni, p (ξ ), M j,q (η), and L k,r (ζ ) by calling Bspline1D three
times.
Next, let’s take a look at nurbs.c below, which calculates the NURBS basis
functions. Main variables and arrays are summarized in Table 9.9. Many of these are
the same as in bspline.c (see Table 9.8).

/* nurbs.c */
/*---------------------------------------------------*/
void Nurbs1D(
double *R,
double *dR,
int n,
double *N,
double *dN,
double xi,

Table 9.9 Variables and arrays used in nurbs.c


R[i]              double, 1D array   R_i(\xi) = \dfrac{N_i(\xi) \cdot w_i}{\sum_{\hat{i}=1}^{n} N_{\hat{i}}(\xi) \cdot w_{\hat{i}}}
dR[i]             double, 1D array   dR_i(\xi) / d\xi
w[]               double, 1D array   Weight vector
R3[i][j][k]       double, 3D array   R_{i,j,k}(\xi, \eta, \zeta) = \dfrac{N_i(\xi) \cdot M_j(\eta) \cdot L_k(\zeta) \cdot w_{i,j,k}}{\sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i}}(\xi) \cdot M_{\hat{j}}(\eta) \cdot L_{\hat{k}}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}}}
dR3_xi[i][j][k]   double, 3D array   \partial R_{i,j,k}(\xi, \eta, \zeta) / \partial \xi
dR3_et[i][j][k]   double, 3D array   \partial R_{i,j,k}(\xi, \eta, \zeta) / \partial \eta
dR3_ze[i][j][k]   double, 3D array   \partial R_{i,j,k}(\xi, \eta, \zeta) / \partial \zeta

int p,
int nkv,
double *KV,
double *w)
{
int i;
double dsum,dsum2,ddsum;
Bspline1D( N, dN, n, xi, p, nkv, KV) ;
for(i=0,dsum=0.0;i<n;i++) dsum += N[i]*w[i] ;
dsum2 = 1.0/dsum/dsum ;
for(i=0;i<n;i++) R[i] = N[i]*w[i]/dsum ;
for(i=0,ddsum=0.0;i<n;i++) ddsum += dN[i]*w[i] ;
for(i=0;i<n;i++)
dR[i] = (dN[i]*w[i]*dsum - N[i]*w[i]*ddsum)*dsum2 ;
}
/*---------------------------------------------------*/
void Nurbs3D(
double ***R3,
double ***dR3_xi,
double ***dR3_et,
double ***dR3_ze,
double *N,
double *dN,
int n,
double *M,
double *dM,
int m,
double *L,
double *dL,
int l,
double *w1,
double *w2,

double *w3)
{
int i,j,k;
double d1,d2,d3,dd1,dd2,dd3,dsum,dsum2,ddsum1,ddsum2,
ddsum3;
for(i=0,dsum=0.0,ddsum1=0.0,ddsum2=0.0,ddsum3=0.0;
i<n;i++){
d1 = N[i]*w1[i] ;
dd1 = dN[i]*w1[i] ;
for(j=0;j<m;j++){
d2 = M[j]*w2[j] ;
dd2 = dM[j]*w2[j] ;
for(k=0;k<l;k++){
d3 = L[k]*w3[k] ;
dd3 = dL[k]*w3[k] ;
dsum += d1*d2*d3 ;
ddsum1 += dd1*d2*d3;
ddsum2 += d1*dd2*d3;
ddsum3 += d1*d2*dd3;
}
}
}
dsum2 = 1.0/dsum/dsum ;
for(i=0;i<n;i++){
d1 = N[i]*w1[i] ;
dd1 = dN[i]*w1[i] ;
for(j=0;j<m;j++){
d2 = M[j]*w2[j] ;
dd2 = dM[j]*w2[j] ;
for(k=0;k<l;k++){
d3 = L[k]*w3[k] ;
dd3 = dL[k]*w3[k] ;
R3[i][j][k] = d1*d2*d3/dsum ;
/* chain-rule derivative of R3 = d1*d2*d3/dsum; cf. Eqs. (9.1.124)-(9.1.126) */
dR3_xi[i][j][k] = d2*d3*(dd1*dsum - d1*ddsum1)*dsum2;
dR3_et[i][j][k] = d1*d3*(dd2*dsum - d2*ddsum2)*dsum2;
dR3_ze[i][j][k] = d1*d2*(dd3*dsum - d3*ddsum3)*dsum2;
}
}
}
}
/*---------------------------------------------------*/

Using the one-dimensional B-spline basis functions and the weight vector, the
one-dimensional NURBS basis function is defined by (see also Sect. 6.2)

R_{i,p}(\xi) = \frac{N_{i,p}(\xi) \cdot w_i}{\sum_{\hat{i}=1}^{n} N_{\hat{i},p}(\xi) \cdot w_{\hat{i}}}   (9.1.121)
Its first-order derivative is written as

\frac{dR_{i,p}(\xi)}{d\xi} = \frac{\dfrac{dN_{i,p}(\xi)}{d\xi} \cdot w_i \cdot \left( \sum_{\hat{i}=1}^{n} N_{\hat{i},p}(\xi) \cdot w_{\hat{i}} \right) - N_{i,p}(\xi) \cdot w_i \cdot \left( \sum_{\hat{i}=1}^{n} \dfrac{dN_{\hat{i},p}(\xi)}{d\xi} \cdot w_{\hat{i}} \right)}{\left( \sum_{\hat{i}=1}^{n} N_{\hat{i},p}(\xi) \cdot w_{\hat{i}} \right)^2}   (9.1.122)

In the code nurbs.c above, the function Nurbs1D takes as input a knot vector
and a weight vector. First, the B-spline basis functions and their first-order derivatives
are computed from the knot vector by calling the Bspline1D function, and then
the NURBS basis functions and their first-order derivatives are computed using Eqs.
(9.1.121) and (9.1.122).
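A minimal sketch, not part of the book's listing, of calling Nurbs1D on the same open knot vector with a hypothetical weight vector w = {1, 1, 2, 1}; with all weights equal to 1 the values R[i] would reduce to the B-spline values N[i]. It assumes nurbs.c and bspline.c are compiled and linked together.

/* Driver sketch for Nurbs1D of nurbs.c */
#include <stdio.h>

void Nurbs1D(double *R, double *dR, int n, double *N, double *dN,
             double xi, int p, int nkv, double *KV, double *w);

int main(void)
{
  double KV[7] = {0.0, 0.0, 0.0, 1.0, 2.0, 2.0, 2.0};
  double w[4]  = {1.0, 1.0, 2.0, 1.0};
  double N[4], dN[4], R[4], dR[4];
  int i, n = 4, p = 2, nkv = 7;
  Nurbs1D(R, dR, n, N, dN, 0.5, p, nkv, KV, w);
  for(i = 0; i < n; i++)
    printf("R[%d] = %f   dR[%d] = %f\n", i, R[i], i, dR[i]);
  return 0;
}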
The two- and three-dimensional NURBS basis functions are obtained as simple
extensions of Eq. (9.1.121). The latter is given by

R_{i,j,k}^{p,q,r}(\xi, \eta, \zeta) = \frac{N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot L_{k,r}(\zeta) \cdot w_{i,j,k}}{\sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}}}   (9.1.123)

where wi, j,k = wi · w j · wk , the product of the weights in each axis direction, and
the first-order derivatives (partial derivatives) of the three-dimensional NURBS basis
function with respect to ξ, η, and ζ are, respectively, written as follows:
\frac{\partial R_{i,j,k}^{p,q,r}(\xi, \eta, \zeta)}{\partial \xi} = \frac{1}{\left( \sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}} \right)^2}
\times \left\{ \frac{\partial N_{i,p}(\xi)}{\partial \xi} \cdot M_{j,q}(\eta) \cdot L_{k,r}(\zeta) \cdot w_{i,j,k} \left( \sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}} \right) \right.
\left. - N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot L_{k,r}(\zeta) \cdot w_{i,j,k} \left( \sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} \frac{\partial N_{\hat{i},p}(\xi)}{\partial \xi} \cdot M_{\hat{j},q}(\eta) \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}} \right) \right\}   (9.1.124)

\frac{\partial R_{i,j,k}^{p,q,r}(\xi, \eta, \zeta)}{\partial \eta} = \frac{1}{\left( \sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}} \right)^2}
\times \left\{ N_{i,p}(\xi) \cdot \frac{\partial M_{j,q}(\eta)}{\partial \eta} \cdot L_{k,r}(\zeta) \cdot w_{i,j,k} \left( \sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}} \right) \right.
\left. - N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot L_{k,r}(\zeta) \cdot w_{i,j,k} \left( \sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot \frac{\partial M_{\hat{j},q}(\eta)}{\partial \eta} \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}} \right) \right\}   (9.1.125)

\frac{\partial R_{i,j,k}^{p,q,r}(\xi, \eta, \zeta)}{\partial \zeta} = \frac{1}{\left( \sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}} \right)^2}
\times \left\{ N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot \frac{\partial L_{k,r}(\zeta)}{\partial \zeta} \cdot w_{i,j,k} \left( \sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot L_{\hat{k},r}(\zeta) \cdot w_{\hat{i},\hat{j},\hat{k}} \right) \right.
\left. - N_{i,p}(\xi) \cdot M_{j,q}(\eta) \cdot L_{k,r}(\zeta) \cdot w_{i,j,k} \left( \sum_{\hat{i}=1}^{n} \sum_{\hat{j}=1}^{m} \sum_{\hat{k}=1}^{l} N_{\hat{i},p}(\xi) \cdot M_{\hat{j},q}(\eta) \cdot \frac{\partial L_{\hat{k},r}(\zeta)}{\partial \zeta} \cdot w_{\hat{i},\hat{j},\hat{k}} \right) \right\}   (9.1.126)

The function Nurbs3D computes the three-dimensional NURBS basis functions.
Taking as input the weight vectors and the precomputed B-spline basis functions in
each axis direction, it computes the three-dimensional NURBS basis functions and
their first-order partial derivatives using Eqs. (9.1.123)–(9.1.126).

9.2 Computer Programming for Training Phase

To understand deep learning, it is important to understand the related mathematical
formulas. But it is also essential to actually run programs and study the results.
Here, we show simple example programs to help the reader get started with deep
learning.
Section 9.2.1 describes an example of the simplest implementation of a feedforward
neural network in C, while Sect. 9.2.2 shows an example of extending the
program using the general-purpose BLAS library. Both of these are described together
with the background mathematical equations in detail. Section 9.2.3 describes an
implementation of a feedforward neural network using the Python language, which is
now the most important language for deep learning, and Sect. 9.2.4 an implementation
of a convolutional neural network as well, focusing on the usage of deep learning
libraries.

9.2.1 Sample Code for Feedforward Neural Networks in C Language

In Chap. 2, the behavior of neural networks has been explained in detail by using
mathematical formulas, which will be helpful for the readers to understand the
program given here.
Now, DLneuro.c, a simple program for a fully connected feedforward neural
network in C language, is studied. The program employs the SGD (Stochastic
Gradient Descent), which performs the error back propagation for each training
pattern, and the momentum method (see Sect. 2.3.1) to accelerate the training. It also
performs data augmentation by adding noise to the input data during training. Main
variables and arrays used in DLneuro.c are listed in Table 9.10. Note that, for the
sake of brevity, various additional processings such as error handling routines for
wrong usage have been omitted from DLneuro.c.
DLneuro.c assumes an input data file (text file) of the following form:

1 0.11 0.31 0.12 0.70 0.25 0.52 0.20 0.91


2 0.10 0.20 0.84 0.69 0.40 0.43 0.85 0.10
3 0.32 0.55 0.21 0.03 0.51 0.18 0.15 0.34
.....
1000 0.27 0.80 0.90 0.22 0.37 0.35 0.15 0.73

The input data above are for the case of 5 input data (parameters) and 3 output
(teacher) data. The total number of patterns, including training patterns and veri-
fication patterns, is 1000. Each row corresponds to a pattern: the first column is a
sequential number, columns 2–6 the input data, and columns 7–9 the teacher data.
The input and the teacher data are both assumed to be single-precision real values.
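A minimal sketch, not part of the book's listing, of allocating the pattern arrays with malloc and reading such a file with read_file() of DLcommon.c (the file name sample.dat is hypothetical):

/* Driver sketch for read_file(): 5 inputs, 3 teacher values, 1000 patterns */
#include <stdio.h>
#include <stdlib.h>

void read_file(char *name, float **o, float **t, int nIU, int nOU, int npattern);

int main(void)
{
  int i, nIU = 5, nOU = 3, npattern = 1000;
  float **o = (float **)malloc(npattern * sizeof(float *));
  float **t = (float **)malloc(npattern * sizeof(float *));
  for(i = 0; i < npattern; i++){
    o[i] = (float *)malloc(nIU * sizeof(float));
    t[i] = (float *)malloc(nOU * sizeof(float));
  }
  read_file("sample.dat", o, t, nIU, nOU, npattern);
  printf("first pattern: o[0][0] = %f  t[0][0] = %f\n", o[0][0], t[0][0]);
  return 0;
}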
First, let’s take a look at DLcommon.c below, which contains commonly used
functions such as those for file input and activation functions.

/* DLcommon.c */
/*---------------------------------------------------*/
void s_shuffle(
int *a,

Table 9.10 Variables and constants in DLneuro.c


nIU int Number of units in the input layer
nOU int Number of units in the output layer
nHL int Number of hidden layers
nHU[i] int, 1D array Number of units in the i-th hidden layer
MaxPattern int Number of patterns (Training + Test)
lp_no int Number of training patterns
tp_no int Number of test patterns
MaxEpochs int Number of epochs
Wmin float Lower bound of the range [Wmin, Wmax] in which all the weights are initially set
Wmax float Upper bound of the range [Wmin, Wmax] in which all the weights are initially set
Alpha float Training coefficient for the weights
Beta float Training coefficient for the biases
Mom1 float The coefficient in the momentum method for the weight update
Mom2 float The coefficient in the momentum method for the bias update
NoiseLevel float Amount of noise used for data augmentation
uOU[i] float, 1D array Input to the i-th unit in the output layer
zOU[i] float, 1D array Output from the i-th unit in the output layer
zdOU[i] float, 1D array First-order derivative of zOU[i] with respect to uOU[i]
uHU[i][j] float, 2D array Input to the j-th unit in the i-th hidden layer
zHU[i][j] float, 2D array Output from the j-th unit in the i-th hidden layer
zdHU[i][j] float, 2D array First-order derivative of zHU[i][j] with respect to uHU[i][j]
zIU[i] float, 1D array Output from the i-th unit in the input layer for a training pattern
zIUor[i][j] float, 2D array Output from the j-th unit in the input layer for the i-th training
pattern. (Input data)
t[i][j] float, 2D array Teacher signal of the i-th input pattern for the j-th output unit
w[i][j][k] float, 3D array Weight between j-th unit in the i + 1-layer and k-th unit in the
i-th layer
w_min[i][j][k] float, 3D array Weight between j-th unit in the i + 1-layer and k-th unit in the
i-th layer, for the minimum test error
dw[i][j][k] float, 3D array Weight update for w [ i ][ j ][ k]
bias[i][j] float, 2D array Bias of the j-th unit in the i-th layer
bias_min[i][j] float, 2D array Bias of the j-th unit in the i-th layer, for the minimum test error
dbias[i][j] float, 2D array Bias update for bias [ i ][ j]
dtemp[i][j] float, 2D array Temporary data used in back_propagation

int n)
{
int i,ic,tsize,itemp;
for(i=0,tsize=n;i<n-1;i++,tsize--){
ic = floor(drand48()*(tsize-1) + 0.1) ;
itemp = a[tsize-1] ;
a[tsize-1] = a[ic] ;
a[ic] = itemp ;
}
}
/*---------------------------------------------------*/
void a0f(
float *fv,
float *fvd,
float x)
{
float dd;
dd = (1.0f+(float)tanh(x/2.0f))/2.0f;
*fv = dd;
*fvd = dd*(1.0 - dd) ;
}
/*---------------------------------------------------*/
void a1f(
float *fv,
float *fvd,
float x)
{
float dd;
dd = (1.0f+(float)tanh(x/2.0f))/2.0f;
*fv = dd;
*fvd = dd*(1.0 - dd) ;
}
/*---------------------------------------------------*/
void read_file(
char *name,
float **o,
float **t,
int nIU,
int nOU,
int npattern)
{
int i,j,k;
FILE *fp;
fp = fopen( name, "r" ) ;
for(i=0;i<npattern;i++){
fscanf(fp,"%d",&k);
for(j=0;j<nIU;j++) fscanf(fp,"%e",o[i]+j);
for(j=0;j<nOU;j++) fscanf(fp,"%e",t[i]+j);
}
fclose( fp );
}

/*---------------------------------------------------*/
void initialize(
float ***w,
float **bias,
int nIU,
int *nHU,
int nOU,
int nHL)
{
int i,j,k;
for(i=0;i<=nHL;i++)
for(j=0;j<nHU[i+1];j++)
for(k=0;k<nHU[i];k++) w[i][j][k] = rnd() ;
for(j=1;j<=nHL+1;j++)
for(i=0;i<nHU[j];i++) bias[j][i] = rnd();
}
/*---------------------------------------------------*/
void store_weight(
float ***w,
float **bias,
float ***w_min,
float **bias_min,
int nIU,
int *nHU,
int nOU,
int nHL)
{
int i,j,k;
for(i=0;i<=nHL;i++)
for(j=0;j<nHU[i+1];j++)
for(k=0;k<nHU[i];k++) w_min[i][j][k] = w[i][j][k] ;
for(j=1;j<=nHL+1;j++)
for(i=0;i<nHU[j];i++) bias_min[j][i] = bias[j][i] ;
}
/*---------------------------------------------------*/
void show_results(
float ***w,
float **bias,
float ***w_m,
float **bias_m,
int nIU,
int *nHU,
int nOU,
int nHL)
{
int i,j,k,iL;
for(iL=0;iL<=nHL;iL++){
for(i=0;i<nHU[iL];i++){
printf("%5d",i);
for(j=0;j<nHU[iL+1];j++)
printf(" %e",w_m[iL][j][i]);

printf("\n");
}
}
for(iL=1;iL<=nHL+1;iL++){
for(j=0;j<nHU[iL];j++) printf("%e ",bias_m[iL][j]);
printf("\n");
}
for(iL=0;iL<=nHL;iL++){
for(i=0;i<nHU[iL];i++){
printf("%5d",i);
for(j=0;j<nHU[iL+1];j++)
printf(" %e",w[iL][j][i]);
printf("\n");
}
}
for(iL=1;iL<=nHL+1;iL++){
for(j=0;j<nHU[iL];j++) printf("%e ",bias[iL][j]);
printf("\n");
}
}
/*---------------------------------------------------*/
void clear_dweight(
float ***dw,
float **dbias,
int nIU,
int *nHU,
int nOU,
int nHL)
{
int i,j,k;
for(i=0;i<=nHL;i++)
for(j=0;j<nHU[i+1];j++)
for(k=0;k<nHU[i];k++) dw[i][j][k] = 0.0 ;
for(j=1;j<=nHL+1;j++)
for(i=0;i<nHU[j];i++) dbias[j][i] = 0.0 ;
}
/*---------------------------------------------------*/

In the code above, the function s_shuffle takes as input an array a[] of
integers from 0 to n−1. When this function is executed, the original array a[] is
randomly reordered. In other words, it is a function to generate permutations and is
used to shuffle the training patterns each epoch during training of a neural network.
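A minimal sketch, not part of the book's listing, of shuffling ten pattern indices with s_shuffle() (DLcommon.c is assumed to be compiled and linked together):

/* Driver sketch for s_shuffle() */
#include <stdio.h>
#include <stdlib.h>   /* drand48() */

void s_shuffle(int *a, int n);

int main(void)
{
  int order[10], i, n = 10;
  for(i = 0; i < n; i++) order[i] = i;   /* 0, 1, ..., 9 */
  s_shuffle(order, n);                   /* order[] becomes a random permutation */
  for(i = 0; i < n; i++) printf("%d ", order[i]);
  printf("\n");
  return 0;
}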
The function a0f is the activation function used in the hidden layers, and the
function a1f is that used in the output layer. Each of them outputs both the function
value and the first-order derivative of the activation function for a given input x.
Here, both functions are assumed to be sigmoid functions.
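The listing evaluates the sigmoid through the hyperbolic tangent; the following identity (a standard fact, not stated explicitly in the book) shows that the returned value and derivative coincide with the usual logistic form:

\sigma(x) = \frac{1}{1 + e^{-x}} = \frac{1}{2}\left( 1 + \tanh\frac{x}{2} \right), \qquad \frac{d\sigma(x)}{dx} = \sigma(x)\left( 1 - \sigma(x) \right)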
The function read_file is used to read the input data, such as the example
shown above. Given the total number of patterns (npattern), the number of input

data (parameters) (nIU), the number of teacher data (nOU), and the input data file
name (name[]), this function reads the data from the file and stores them in arrays.
The function initialize is a function that initializes all the connection weights
and biases with random numbers. In this case, they are initialized with uniform
random numbers.
The function store_weight copies the connection weights and biases to
another arrays. It is used to store the connection weights and biases that minimize
the error for the verification patterns.
The function show_results outputs the connection weights and biases after
the training of the neural network has been completed.
The function clear_dweight clears the amount of updates of the connection
weights and biases to zero.
Next, let’s take a look at DLebp.c, which contains functions for the forward and
backward propagation, the core part of DLneuro.c.

/* DLebp.c */
/*---------------------------------------------------*/
void propagation(
int p,
float **zIU,
float **zHU,
float **zdHU,
float *zOU,
float *zdOU,
float ***w,
float **bias,
int nIU,
int *nHU,
int nOU,
int nHL)
{
int i,j,iL;
float net;
for(i=0;i<nHU[1];i++){
for(net=0,j=0;j<nIU;j++)
net += w[0][i][j] * zIU[p][j];
net += bias[1][i];
a0f(zHU[1]+i, zdHU[1]+i, net) ;
}
for(iL=2;iL<=nHL;iL++){
for(i=0;i<nHU[iL];i++){
for(net=0,j=0;j<nHU[iL-1];j++)
net += w[iL-1][i][j] * zHU[iL-1][j];
net += bias[iL][i];
a0f(zHU[iL]+i, zdHU[iL]+i, net) ;
}
}

for(i=0;i<nOU;i++){
for(net=0,j=0;j<nHU[nHL];j++)
net += w[nHL][i][j] * zHU[nHL][j];
net += bias[nHL+1][i];
a1f(zOU+i, zdOU+i, net) ;
}
}
/*---------------------------------------------------*/
void back_propagation(
int p,
float **t,
float **zIU,
float **zHU,
float **zdHU,
float *zOU,
float *zdOU,
float ***w,
float **bias,
float ***dw,
float **dbias,
float **d,
int nIU,
int *nHU,
int nOU,
int nHL,
float Alpha,
float Beta)
{
int i,j,iL;
float sum;
for(i=0;i<nOU;i++)
d[nHL+1][i] = (t[p][i] - zOU[i]) * zdOU[i] ;
for(i=0;i<nHU[nHL];i++){
for(sum=0.0f,j=0;j<nOU;j++)
sum += d[nHL+1][j] * w[nHL][j][i];
d[nHL][i] = zdHU[nHL][i] * sum;
}
for(iL=nHL-1;iL>=1;iL--){
for(j=0;j<nHU[iL];j++){
for(sum=0.0f,i=0;i<nHU[iL+1];i++)
sum += d[iL+1][i] * w[iL][i][j];
d[iL][j] = zdHU[iL][j] * sum;
}
}
for(iL=nHL+1;iL>=1;iL--){
for(i=0;i<nHU[iL];i++){
dbias[iL][i] = Beta*d[iL][i] + Mom1*dbias[iL][i];
bias[iL][i] += dbias[iL][i];
}
}
for(iL=nHL;iL>=1;iL--){

for(j=0;j<nHU[iL+1];j++){
for(i=0;i<nHU[iL];i++){
dw[iL][j][i] = Mom2*dw[iL][j][i];
dw[iL][j][i] += Alpha*d[iL+1][j]*zHU[iL][i] ;
w[iL][j][i] += dw[iL][j][i];
}
}
}
for(j=0;j<nHU[1];j++){
for(i=0;i<nIU;i++){
dw[0][j][i] = Alpha*d[1][j]*zIU[p][i] + Mom2*dw[0][j][i];
w[0][j][i] += dw[0][j][i];
}
}
}
/*---------------------------------------------------*/

The function propagation computes the forward propagation from input to


output. Calculations in each unit are performed by
O_j^l = f\left( U_j^l \right)   (9.2.1)

U_j^l = \sum_{i=1}^{n_{l-1}} w_{ji}^{l-1} \cdot O_i^{l-1} + \theta_j^l   (9.2.2)

where f() is the activation function, U_j^l the input to the j-th unit of the l-th layer,
O_j^l its output, w_{ji}^{l-1} the connection weight between the j-th unit in the l-th layer and the
i-th unit in the (l-1)-th layer, and \theta_j^l the bias of the j-th unit of the l-th layer. The
function propagation calculates sequentially from the input layer to the output
layer by using Eqs. (9.2.1) and (9.2.2).
The function back_propagation is used for the error back propagation
learning. Here, the squared error is employed as the error function. That is, the error
to be minimized is defined to be the sum of the squares of the difference between the
output of the neural network and the teacher data as follows:


E = \frac{1}{2} \sum_{j=1}^{n_L} \left( {}^{p}O_j^L - {}^{p}T_j \right)^2   (9.2.3)

where {}^{p}O_j^L is the output of the j-th unit of the output layer for the p-th training
pattern, {}^{p}T_j the corresponding teacher data, and n_L the number of units in the output
layer. In the error back propagation learning, the connection weights are updated
based on the following equations.
Fig. 9.8 Four-layer feedforward neural network

\Delta w_{ji}^{l-1} = -\frac{\partial E(w, \theta)}{\partial w_{ji}^{l-1}}   (9.2.4)

w_{ji}^{l-1} = w_{ji}^{l-1} + \alpha \cdot \Delta w_{ji}^{l-1}   (9.2.5)

As an example, let’s check the behavior of the function back_propagation in


the case of a feedforward neural network with four layers (two hidden layers) shown
in Fig. 9.8. The number of hidden layers nHL is assumed to be 2.
First, the amount of update of the connection weight between the output and the
second hidden layers is given by
\alpha \Delta w_{ji}^3 = -\alpha \frac{\partial E}{\partial w_{ji}^3}
  = \alpha \left( T_j - O_j^4 \right) \frac{\partial f\left( U_j^4 \right)}{\partial U_j^4} O_i^3          (9.2.6)

In the function back_propagation, Δw_{ji}^3 is written as dw[2][j][i] and


computed in the function as follows:
d[3][i] = (t[p][i] - zOU[i]) * zdOU[i] ;
dw[2][j][i] = Alpha*d[3][j]*zHU[2][i] + Mom2*dw[2][j][i] ;

Except for the term for the momentum method, the code is consistent with Eq.
(9.2.6).
Next, the amount of update of the connection weight between the second and the
first hidden layers is given by
\alpha \Delta w_{ji}^2 = -\alpha \frac{\partial E}{\partial w_{ji}^2}
  = \alpha \sum_{k=1}^{n_4} \left( T_k - O_k^4 \right) \frac{\partial f\left( U_k^4 \right)}{\partial U_k^4} w_{kj}^3
    \frac{\partial f\left( U_j^3 \right)}{\partial U_j^3} O_i^2          (9.2.7)

In the function back_propagation, Δw_{ji}^2 is written as dw[1][j][i], and


calculated as follows. Firstly, d[3][i] is calculated by the equation as
d[3][i] = (t[p][i] - zOU[i]) * zdOU[i] ;

Secondly, d[2][i] is calculated in the following for-loop as


for(i=0;i<nHU[2];i++){
for(sum=0.0f,j=0;j<nOU;j++) sum += d[3][j]*w[2][j][i];
d[2][i] = zdHU[2][i]*sum;
}

Finally, dw[1][j][i] is calculated as follows:


dw[1][j][i] = Alpha*d[2][j]*zHU[1][i] + Mom2*dw[1][j][i] ;

Except for the term for the momentum method, the code is consistent with Eq.
(9.2.7).
Further, the amount of update of the connection weight between the first hidden
layer and the input layer is given by

\alpha \Delta w_{ji}^1 = -\alpha \frac{\partial E}{\partial w_{ji}^1}
  = \alpha \sum_{k=1}^{n_4} \left( T_k - O_k^4 \right) \frac{\partial f\left( U_k^4 \right)}{\partial U_k^4}
    \sum_{l=1}^{n_3} w_{kl}^3 \frac{\partial f\left( U_l^3 \right)}{\partial U_l^3} w_{lj}^2
    \frac{\partial f\left( U_j^2 \right)}{\partial U_j^2} O_i^1          (9.2.8)

In the function back_propagation, Δw_{ji}^1 is written as dw[0][j][i] and


calculated as follows. Firstly, d[3][i] is calculated by
d[3][i] = (t[p][i] - zOU[i]) * zdOU[i] ;

Secondly, d[2][i] is calculated by the double for-loop as


for(i=0;i<nHU[2];i++){
for(sum=0.0f,j=0;j<nOU;j++) sum += d[3][j]*w[2][j][i];
d[2][i] = zdHU[2][i]*sum;
}

Thirdly, d[1][i] is calculated by the triple for-loop as


for(iL=1;iL>=1;iL--){ //Here it means iL=1.
for(j=0;j<nHU[1];j++){
for(sum=0.0f,i=0;i<nHU[2];i++)
sum += d[2][i] * w[1][i][j];
d[1][j] = zdHU[1][j] * sum;
}
}

Finally, dw[0][j][i] is calculated as follows:


dw[0][j][i] = Alpha*d[1][j]*zIU[p][i] + Mom2*dw[0][j][i];

Except for the term for the momentum method, the code is consistent with Eq.
(9.2.8).
The amounts of update of the biases are summarized as follows:
\beta \Delta\theta_i^4 = -\beta \frac{\partial E}{\partial \theta_i^4}
  = -\beta \left( O_i^4 - T_i \right) \frac{\partial f\left( U_i^4 \right)}{\partial U_i^4}          (9.2.9)

\beta \Delta\theta_i^3 = -\beta \frac{\partial E}{\partial \theta_i^3}
  = -\beta \sum_{j=1}^{n_4} \left( O_j^4 - T_j \right) \frac{\partial f\left( U_j^4 \right)}{\partial U_j^4} w_{ji}^3
    \frac{\partial f\left( U_i^3 \right)}{\partial U_i^3}          (9.2.10)

\beta \Delta\theta_i^2 = -\beta \frac{\partial E}{\partial \theta_i^2}
  = -\beta \sum_{j=1}^{n_4} \left( O_j^4 - T_j \right) \frac{\partial f\left( U_j^4 \right)}{\partial U_j^4}
    \sum_{k=1}^{n_3} w_{jk}^3 \frac{\partial f\left( U_k^3 \right)}{\partial U_k^3} w_{ki}^2
    \frac{\partial f\left( U_i^2 \right)}{\partial U_i^2}          (9.2.11)

In the function back_propagation, Δθ_i^4, Δθ_i^3, and Δθ_i^2 are represented as
dbias[3][i], dbias[2][i] and dbias[1][i], respectively. As in the case
of the update of the connection weights, the codes are consistent with Eqs. (9.2.9)–
(9.2.11).
Finally, we look at the main body of DLneuro.c.

/* DLneuro.c */
#include "nrutil.c"
#include <math.h>
#define rnd() (drand48() * (Wmax - Wmin) + Wmin)
#define noise() ((drand48()-0.5f)*2.0f*NoiseLevel)
#define FNAMELENGTH 100
#define NHU_V 1
#define NHU_C 0
float Mom1=0.05f;
float Mom2=0.05f;
float Wmin = -0.10f ;
float Wmax = 0.10f ;
#include "DLcommon.c"
#include "DLebp.c"
/*---------------------------------------------------*/
int main(void)
{
int i,j,k,iteration_min,i1,j1,rseed,o_freq,MaxPattern,
MaxEpochs,lp_no,tp_no,nIU,nOU,*nHU,nHL,nHU0,nhflag,
*idx1;
float *zOU,**zOU_min,**zIU,**zIUor,**zHU,***w,**bias,

***dw,**dbias,***w_min,**bias_min,**dtemp,**zdHU,*zdOU,
ef1,ef2,ef2_min=1e6,NoiseLevel, Alpha,Beta,**t ;
char fname1[FNAMELENGTH];
FILE *fp;
/*---------------------------------------------------*/
scanf("%d %d %d %d %d %d %d %d %d %s %d %e %e %e",
&MaxPattern,&lp_no,&nIU,&nHU0,&nOU,&nHL,&nhflag,
&MaxEpochs,&o_freq,fname1,&rseed,&Alpha,&Beta,
&NoiseLevel);
tp_no = MaxPattern - lp_no ;
/*---------------------------------------------------*/
nHU = ivector(0,nHL+1);
if(nhflag == NHU_V){
for(i=1;i<=nHL;i++) scanf("%d",nHU+i);
}else{
for(i=1;i<=nHL;i++) nHU[i] = nHU0 ;
}
nHU[0] = nIU ;
nHU[nHL+1] = nOU ;
/*---------------------------------------------------*/
t = matrix(0,MaxPattern-1,0,nOU-1) ;
zIU = matrix(0,MaxPattern-1,0,nIU-1) ;
zIUor = matrix(0,MaxPattern-1,0,nIU-1) ;
zHU = (float **)malloc((nHL+2)*sizeof(float *));
for(i=0;i<nHL+2;i++) zHU[i] = vector(0,nHU[i]-1);
zdHU = (float **)malloc((nHL+2)*sizeof(float *));
for(i=0;i<nHL+2;i++) zdHU[i] = vector(0,nHU[i]-1);
zOU = vector(0,nOU-1) ;
zdOU = vector(0,nOU-1) ;
zOU_min = matrix(0,MaxPattern-1,0,nOU-1) ;
w = (float ***)malloc((nHL+1)*sizeof(float **));
for(i=0;i<=nHL;i++)
w[i] = matrix(0,nHU[i+1]-1,0,nHU[i]-1) ;
w_min = (float ***)malloc((nHL+1)*sizeof(float **));
for(i=0;i<=nHL;i++)
w_min[i] = matrix(0,nHU[i+1]-1,0,nHU[i]-1) ;
dw = (float ***)malloc((nHL+1)*sizeof(float **));
for(i=0;i<=nHL;i++)
dw[i] = matrix(0,nHU[i+1]-1,0,nHU[i]-1) ;
bias = (float **)malloc((nHL+2)*sizeof(float *));
for(i=0;i<=nHL+1;i++) bias[i] = vector(0,nHU[i]-1) ;
bias_min = (float **)malloc((nHL+2)*sizeof(float *));
for(i=0;i<=nHL+1;i++) bias_min[i] = vector(0,nHU[i]-1) ;
dbias = (float **)malloc((nHL+2)*sizeof(float *));
for(i=0;i<=nHL+1;i++) dbias[i] = vector(0,nHU[i]-1) ;
dtemp = (float **)malloc((nHL+2)*sizeof(float *));
for(i=0;i<nHL+2;i++) dtemp[i] = vector(0,nHU[i]-1) ;

idx1 = (int *)malloc(lp_no*sizeof(int));


for(i=0;i<lp_no;i++) idx1[i] = i ;
/*---------------------------------------------------*/
read_file(fname1,zIUor,t,nIU,nOU,MaxPattern);
srand48(rseed);
initialize(w,bias,nIU,nHU,nOU,nHL);
clear_dweight(dw,dbias,nIU,nHU,nOU,nHL);
/*---------------------------------------------------*/
for(i=0;i<=MaxEpochs;i++){
for(i1=0;i1<lp_no;i1++)
for(j=0;j<nIU;j++)
zIU[i1][j] = (1.0f + noise())*zIUor[i1][j] ;
s_shuffle(idx1,lp_no) ;
for(j=0;j<lp_no;j++){
propagation(idx1[j],zIU,zHU,zdHU,zOU,zdOU,
w,bias,nIU,nHU,nOU,nHL);
back_propagation(idx1[j],t,zIU,zHU,zdHU,zOU,zdOU,
w,bias,dw,dbias,dtemp,nIU,nHU,nOU,nHL,Alpha,Beta);
}
if(i%o_freq==0){
for(ef1=0.0f,j=0;j<lp_no;j++){
propagation(j,zIUor,zHU,zdHU,zOU,zdOU,
w,bias,nIU,nHU,nOU,nHL);
for(k=0;k<nOU;k++) ef1 += fabs(t[j][k] - zOU[k]);
}
for(ef2=0.0f,j=lp_no;j<lp_no+tp_no;j++){
propagation(j,zIUor,zHU,zdHU,zOU,zdOU,
w,bias,nIU,nHU,nOU,nHL);
for(k=0;k<nOU;k++) ef2 += fabs(t[j][k] - zOU[k]);
}
printf("%d th Error : %.5f %.5f\n",i,ef1/lp_no,ef2/tp_no);
if(ef2<=ef2_min){
ef2_min = ef2;
iteration_min = i;
store_weight(w,bias,w_min,bias_min,nIU,nHU,nOU,nHL);
}
}
}
/*---------------------------------------------------*/
show_results(w, bias, w_min, bias_min, nIU, nHU, nOU, nHL) ;
return 0 ;
}

nrutil.c, which is included at the beginning of DLneuro.c, defines a set


of functions provided in the Ref. [14]. nrutil.c is in the public domain and
can be downloaded from the web site (http://numerical.recipes). In DLneuro.c,
nrutil.c is used for array allocation.
rnd() is a macro that generates uniform random numbers in the range
[Wmin, Wmax], and noise() generates uniform random numbers in the range
[−NoiseLevel, NoiseLevel]. Although drand48() is used here to generate the
random numbers in DLneuro.c, it could be replaced by a better routine.
The structure of the main function is summarized as follows:
(1) Loading of meta-parameters such as number of training patterns, number of
hidden layers, etc.
(2) Allocation of arrays based on the meta-parameters.
(3) Loading of training patterns from a file by read_file function.
(4) Initialization of the connection weights and biases by initialize function.
(5) Start of training (Training continues until MaxEpochs is met.)
(6) End of training (MaxEpochs is met.)
(7) Output of training results, e.g., all the connection weights and biases by
show_results function.
The operations per epoch in the training loop are summarized as follows:
(E1) Addition of noise to the input data.
(E2) Change of the order of presentation of training patterns by s_shuffle
function.
(E3) Following operations are performed per training pattern.
(E3-1) Propagation (propagation)
(E3-2) Back propagation (back_propagation)
(E3-3) Following operations are performed every o_freq epochs.
(E3-3-1) Calculate sum of errors for training patterns.
(E3-3-2) Calculate sum of errors for verification patterns.
(E3-3-3) Output average error.
(E3-3-4) If the error for verification patterns becomes minimum,
(E3-3-4-1) Copy all the connection weights and biases to other arrays.

DLneuro.c has been tested successfully on Linux. A compilation example is


shown as
$ cc -O3 -o DLneuro.exe DLneuro.c -lm

After compiling, run the program as follows:


$ echo "1000 800 5 20 3 2 0 5000 100 indata.dat 12345 0.1 0.1 0.001" |
./DLneuro.exe > result.txt

In this case, the result will be stored in the result.txt file.


In the above example, the number of units in each of the hidden layers is set to 20,
and the number of training epochs to 5000.
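
For reference, the following comment block (not part of DLneuro.c; the values are simply those of the example above) maps each field of the echo line onto the variable read by the scanf call in main:

/* "1000 800 5 20 3 2 0 5000 100 indata.dat 12345 0.1 0.1 0.001"
   MaxPattern = 1000      total number of patterns (training + verification)
   lp_no      = 800       number of training patterns
   nIU        = 5         number of units in the input layer
   nHU0       = 20        number of units in each hidden layer
   nOU        = 3         number of units in the output layer
   nHL        = 2         number of hidden layers
   nhflag     = 0         same number of units in every hidden layer
   MaxEpochs  = 5000      number of training epochs
   o_freq     = 100       output frequency (epochs)
   fname1     = indata.dat   input data file
   rseed      = 12345     seed of the random number generator
   Alpha      = 0.1       training coefficient for the weights
   Beta       = 0.1       training coefficient for the biases
   NoiseLevel = 0.001     amount of noise added to the input data           */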

9.2.2 Sample Code for Feedforward Neural Networks in C


with OpenBLAS

In the previous subsection, a program for a feedforward neural network in the C


language is demonstrated in detail. Here, an extension of the program, by using the
open-source general-purpose BLAS library, is provided.
It is known that the computations in a fully connected feedforward neural network
consist of matrix and vector operations, i.e., linear algebraic operations, which are
frequently used in computational mechanics and other fields.
For this reason, a general-purpose library of basic operations in linear algebra has
been developed, called BLAS (Basic Linear Algebra Subprograms). The program
provided in this subsection uses OpenBLAS [https://www.openblas.net], an open-
source BLAS library optimized for many kinds of CPUs, where matrices are repre-
sented as one-dimensional arrays. Consider a matrix consisting of single-precision
real numbers as follows:
A = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \end{bmatrix}          (9.2.12)

The matrix A above is represented by the one-dimensional array as

float A[6] = {a00, a01, a02, a10, a11, a12};

which is stored in the contiguous area of the memory in this order, i.e., components
of the first row come first and those of the second follow.
This storage order is called RowMajor, a standard for matrices in C. In the case
above, the component at the i-th row and the j-th column of the two-dimensional
matrix A is accessed as A[i*3 + j] of the corresponding one-dimensional array
in the RowMajor order by explicitly specifying the number of columns, 3.
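
The following is a minimal sketch (not part of the book's programs; the numerical values are purely illustrative) of this RowMajor indexing applied to the 2-by-3 matrix of Eq. (9.2.12):

/* rowmajor.c : access a 2-by-3 RowMajor matrix stored as a 1D array */
#include <stdio.h>

int main(void)
{
    int i = 1, j = 2;
    int ncol = 3;                         /* number of columns         */
    float A[6] = { 1.0f, 2.0f, 3.0f,      /* first row : a00 a01 a02   */
                   4.0f, 5.0f, 6.0f };    /* second row: a10 a11 a12   */
    printf("A[%d][%d] = %f\n", i, j, A[i * ncol + j]);  /* a12 = 6.0 */
    return 0;
}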
Note that, in the FORTRAN language, the matrix in Eq. (9.2.12) is stored in
memory as the order shown by

{a00, a10, a01, a11, a02, a12}

which is called ColumnMajor, where components of the first, the second and the
third columns are, respectively, stored in this order.
BLAS subprograms are classified into three levels as
Level 1: Subprograms (functions in C) on vectors.

Level 2: Subprograms (functions in C) on vectors and matrices, including the


matrix–vector product.
Level 3: Subprograms (functions in C) on matrices, including the matrix–matrix
multiplication.
The computational intensity, which is defined as the number of operations divided by
the amount of data (bytes) used for the operation, differs among the levels of BLAS.
Consider the operations between vectors in the level 1 BLAS. In the case of addition
of two vectors as
\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}
+ \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}          (9.2.13)

the computational intensity is 0.5 because n additions are performed using 2n pieces
of data.
In the case of the scalar product of vectors as
\begin{pmatrix} a_1 & a_2 & \cdots & a_n \end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}          (9.2.14)

the computational intensity is 1.0 because n additions and n multiplications are


performed using 2n pieces of data.
Consider the operations in the level 2 BLAS. In the case of matrix–vector product
of a matrix and a vector as
\begin{bmatrix}
  a_{11} & a_{12} & \cdots & a_{1n} \\
  a_{21} & a_{22} & \cdots & a_{2n} \\
  \vdots & \vdots & \ddots & \vdots \\
  a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}          (9.2.15)

the computational intensity is about 2.0 because 2n^2 operations are performed using
n^2 + n pieces of data.
For the case of matrix–matrix product, which is one of the operations in the level
3 BLAS and shown as
\begin{bmatrix}
  a_{11} & a_{12} & \cdots & a_{1n} \\
  a_{21} & a_{22} & \cdots & a_{2n} \\
  \vdots & \vdots & \ddots & \vdots \\
  a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix}
  b_{11} & b_{12} & \cdots & b_{1n} \\
  b_{21} & b_{22} & \cdots & b_{2n} \\
  \vdots & \vdots & \ddots & \vdots \\
  b_{n1} & b_{n2} & \cdots & b_{nn}
\end{bmatrix}          (9.2.16)

the computational intensity is n because 2n^3 operations are performed using 2n^2
pieces of data.
Note that, the higher the computational intensity, the more efficiently the
computation can be performed on most computers.
Now, let’s take a look at some examples of typical functions at each level in
OpenBLAS.
Here, we discuss a description of the level 1 function cblas_saxpy(), which
adds a constant (single-precision real number) times a vector x of single-precision
real numbers to a vector y of single-precision real numbers. Its arguments are shown
as follows:

void cblas_saxpy(int n, float a, float *x, int incx, float *y,
                 int incy)

The operation of this function is written without using BLAS as

for(i=0;i<n;i++){
y[incy*i] += a*x[incx*i] ;
}

Note that the dimension of vector x must be at least 1 + (n-1)*incx and that
of vector y at least 1 + (n-1)*incy. There is cblas_daxpy() that performs
the same operation as the above but for double-precision real numbers.
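
As a minimal usage sketch (not part of the book's programs; the values are purely illustrative), the following program adds 2 times the vector x to the vector y:

/* saxpy_example.c : y <- 2*x + y for n = 3, incx = incy = 1 */
#include <stdio.h>
#include <cblas.h>

int main(void)
{
    int i;
    float x[3] = {  1.0f,  2.0f,  3.0f };
    float y[3] = { 10.0f, 20.0f, 30.0f };
    cblas_saxpy(3, 2.0f, x, 1, y, 1);            /* y becomes {12, 24, 36} */
    for (i = 0; i < 3; i++) printf("%f\n", y[i]);
    return 0;
}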
Next, a description of the level 2 function cblas_sgemv() is taken. This func-
tion performs the operation of multiplying the product of an m-by-n matrix A and a
vector x by alpha, and adding it to a vector y multiplied by beta. The components
of matrix A, vectors x and y, and constants alpha and beta are single-precision
real numbers. The arguments of this function are shown as

void cblas_sgemv(int order, int transA, int m, int n,
                 float alpha, float *A, int ldA,
                 float *x, int incx,
                 float beta, float *y, int incy)

A parameter order related to the storage order of matrix A is defined in


cblas.h as follows:

enum CBLAS_ORDER {CblasRowMajor=101, CblasColMajor=102};

Therefore, CblasRowMajor should be specified for the case of the row priority,
and CblasColMajor for the case of the column priority. transA is a parameter
that specifies whether or not to transpose the matrix A, which is defined in cblas.h
as follows:

enum CBLAS_TRANSPOSE {CblasNoTrans=111, CblasTrans=112,
                      CblasConjTrans=113};

If a matrix is to be transposed, CblasTrans is specified; if not,


CblasNoTrans. ldA denotes the leading dimension of matrix A. Now, consider
the behavior of a function specified as follows:

cblas_sgemv(CblasRowMajor, CblasNoTrans, m, n,
alpha, A, n,
x, incx,
beta, y, incy)

The operation of this function can be written without using BLAS as

for(i=0;i<m;i++){
for(j=0,d=0.0f;j<n;j++) d += A[i*n+j]*x[incx*j] ;
y[incy*i] = alpha*d + beta*y[incy*i] ;
}

Note that the dimension of vector x must be at least 1 + (n-1)*incx, and that
of vector y must be at least 1 + (n-1)*incy. There is also cblas_dgemv()
that performs the same operation but for double-precision real numbers.
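
As a minimal usage sketch (not part of the book's programs; the values are purely illustrative), the following call multiplies a 2-by-3 RowMajor matrix by a vector:

/* sgemv_example.c : y <- 1*A*x + 0*y, A is 2-by-3 RowMajor */
#include <stdio.h>
#include <cblas.h>

int main(void)
{
    int i;
    float A[6] = { 1.0f, 2.0f, 3.0f,
                   4.0f, 5.0f, 6.0f };      /* 2-by-3, RowMajor */
    float x[3] = { 1.0f, 1.0f, 1.0f };
    float y[2] = { 0.0f, 0.0f };
    /* m = 2, n = 3, alpha = 1, ldA = 3, beta = 0 */
    cblas_sgemv(CblasRowMajor, CblasNoTrans, 2, 3,
                1.0f, A, 3, x, 1, 0.0f, y, 1);
    for (i = 0; i < 2; i++) printf("%f\n", y[i]);   /* prints 6 and 15 */
    return 0;
}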
Next, we discuss a description of the level 3 function cblas_sgemm(), which
performs the operation of multiplying by alpha the product of an m-by-k matrix A
and a k-by-n matrix B (resulting in an m-by-n matrix), and adding it to an m-by-n
matrix C multiplied by beta. The components of matrices A, B, and C, and the
constants alpha and beta are single-precision real numbers. The arguments of
this function are shown as

void cblas_sgemm(int order, int transA, int transB, int m, int n,
                 int k, float alpha, float *A, int ldA, float *B, int ldB,
                 float beta, float *C, int ldC)

A parameter order specifies again how to store the matrix A. Specify


CblasRowMajor for the row priority or CblasColMajor for the column
priority. transA is a parameter specifying whether or not to transpose the matrix A.
Specify CblasTrans for transposition or CblasNoTrans for no transposition.
transB is a parameter for transposition of the matrix B, and the specification method
is the same as for transA. ldA, ldB, and ldC refer to the leading dimensions of
matrices A, B, and C, respectively.
Now, consider the behavior of a function specified as

cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n,


k, alpha, A, k, B, n, beta, C, n)

The operation of this function in the case above can be written without using
BLAS as

for(i=0;i<m;i++){
for(j=0;j<n;j++){
for(ii=0,d=0.0f;ii<k;ii++) d += A[i*k+ii]*B[ii*n+j] ;
C[i*n+j] = alpha*d + beta*C[i*n+j] ;
}
}
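
As a minimal usage sketch (not part of the book's programs; the values are purely illustrative), the following call multiplies a 2-by-3 matrix A by a 3-by-2 matrix B and stores the 2-by-2 result in C:

/* sgemm_example.c : C <- 1*A*B + 0*C, all matrices RowMajor */
#include <stdio.h>
#include <cblas.h>

int main(void)
{
    int i, j;
    float A[6] = { 1.0f, 2.0f, 3.0f,
                   4.0f, 5.0f, 6.0f };     /* 2-by-3 */
    float B[6] = { 1.0f, 0.0f,
                   0.0f, 1.0f,
                   1.0f, 1.0f };           /* 3-by-2 */
    float C[4] = { 0.0f, 0.0f,
                   0.0f, 0.0f };           /* 2-by-2 */
    /* m = 2, n = 2, k = 3, ldA = 3, ldB = 2, ldC = 2 */
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 3, 1.0f, A, 3, B, 2, 0.0f, C, 2);
    for (i = 0; i < 2; i++) {                       /* C becomes {4, 5, 10, 11} */
        for (j = 0; j < 2; j++) printf("%8.3f", C[i * 2 + j]);
        printf("\n");
    }
    return 0;
}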

OpenBLAS functions described above as well as other functions of the level 1


among the OpenBLAS given as
cblas_scopy ( )
cblas_sscal ( )
cblas_sasum ( )
are utilized in our program. The operations of each of these functions can be described
without using BLAS as follows:

/* void cblas_scopy(int n, float *x, int incx, float *y,


int incy) */
for(i=0;i<n;i++){
y[i*incy] = x[i*incx] ;
}
/* void cblas_sscal(int n, float alpha, float *x, int incx) */
for(i=0;i<n;i++){
x[i*incx] *= alpha ;
}

/* float cblas_sasum(int n, float *x, int incx) */


float sum;
for(i=0,sum=0.0f;i<n;i++){
sum += fabs(x[i*incx]) ;
}
return sum;

While DLneuro.c, described in the previous subsection, updates the connection
weights by the back propagation learning for each training pattern, the program
explained here updates them after the forward propagation of multiple patterns
(called a mini-batch).
Although this slightly degrades the ability of the error back propagation learning
to approximate nonlinear functions, it is often advantageous in total because the
operations can be written in the form of matrix products and the computational effi-
ciency is greatly improved. Especially for large-scale problems with a large number
of training patterns, it is usual to use mini-batches to keep the training time within a
realistic range.
Let the number of training patterns in a mini-batch be mb, and the local training
pattern number in a mini-batch be denoted by a number from 1 to mb. The forward
propagation calculation for the p-th learning pattern in a mini-batch can be written
as follows:

{}^{p}U_j^l = \sum_{i=1}^{n_{l-1}} w_{ji}^{l-1} \cdot {}^{p}O_i^{l-1} + \theta_j^l          (9.2.17)

where ^pU_j^l is the input to the j-th unit in the l-th layer, and ^pO_i^{l-1} the output of
the i-th unit in the (l-1)-th layer. Here, the left shoulder superscript p denotes that
these input and output are those for the p-th learning pattern in a mini-batch. w_{ji}^{l-1} is
the connection weight between the j-th unit in the l-th layer and the i-th unit in the
(l-1)-th layer, and θ_j^l the bias value of the j-th unit in the l-th layer.
Using the scalar product of vectors, Eq. (9.2.17) is written for the first unit in the
l-th layer as
{}^{p}U_1^l = \begin{pmatrix} w_{11}^{l-1} & \cdots & w_{1,n_{l-1}}^{l-1} \end{pmatrix}
\begin{pmatrix} {}^{p}O_1^{l-1} \\ \vdots \\ {}^{p}O_{n_{l-1}}^{l-1} \end{pmatrix} + \theta_1^l          (9.2.18)

Similarly, Eq. (9.2.17) is written for the second unit in the l-th layer as
{}^{p}U_2^l = \begin{pmatrix} w_{21}^{l-1} & \cdots & w_{2,n_{l-1}}^{l-1} \end{pmatrix}
\begin{pmatrix} {}^{p}O_1^{l-1} \\ \vdots \\ {}^{p}O_{n_{l-1}}^{l-1} \end{pmatrix} + \theta_2^l          (9.2.19)

Therefore, for all the units in the l-th layer, we have


⎛ ⎞ ⎡ ⎤⎛ ⎞ ⎛ l⎞
p
U1l w11
l−1
w12
l−1
· · · w1,n
l−1
pO1l−1 θ1
⎜ ⎢
⎟ ⎢w
l−1
⎥ ⎜ ⎟ ⎜ l ⎟

p l
U2 ⎟ ⎢ 21
l−1
w l−1
· · · w l−1
2,nl−1 ⎥⎜
p
O l−1
⎜ θ2 ⎟
⎜ .. ⎟=⎢ .
22
. . ⎥⎜ .2 ⎟ ⎟ + ⎜ . ⎟
⎝ . ⎠ ⎣ .. .. . . . .. ⎥ ⎦⎝ .. ⎠ ⎝ .. ⎠
p l
Un l wnl−1
l ,1
wnl−1
l ,2
· · · wnl−1
l ,n l−1
pOnl−1
l−1
θnl l
⎛ ⎞ ⎛ l⎞
pO1l−1 θ1
[ l−1 ]⎜ ⎜
p l−1 ⎟
O 2 ⎟ ⎜ 2⎟
⎜ θl ⎟
= w ⎜ .. ⎟ + ⎜ . ⎟ (9.2.20)
⎝ . ⎠ ⎝ .. ⎠
p l−1
Onl−1 θnl l

[ ]
where wl−1 is the n l -by-n l−1 matrix with the connection weights between the
(l−1)-th layer and the l-th layer as components.
For the first training pattern in a mini-batch, Eq. (9.2.20) is written as
\begin{pmatrix} {}^{1}U_1^l \\ {}^{1}U_2^l \\ \vdots \\ {}^{1}U_{n_l}^l \end{pmatrix}
= \left[ w^{l-1} \right]
  \begin{pmatrix} {}^{1}O_1^{l-1} \\ {}^{1}O_2^{l-1} \\ \vdots \\ {}^{1}O_{n_{l-1}}^{l-1} \end{pmatrix}
+ \begin{pmatrix} \theta_1^l \\ \theta_2^l \\ \vdots \\ \theta_{n_l}^l \end{pmatrix}          (9.2.21)

Similarly, for the second training pattern in a mini-batch, Eq. (9.2.20) is written
as follows:
\begin{pmatrix} {}^{2}U_1^l \\ {}^{2}U_2^l \\ \vdots \\ {}^{2}U_{n_l}^l \end{pmatrix}
= \left[ w^{l-1} \right]
  \begin{pmatrix} {}^{2}O_1^{l-1} \\ {}^{2}O_2^{l-1} \\ \vdots \\ {}^{2}O_{n_{l-1}}^{l-1} \end{pmatrix}
+ \begin{pmatrix} \theta_1^l \\ \theta_2^l \\ \vdots \\ \theta_{n_l}^l \end{pmatrix}          (9.2.22)

Finally, for all the training patterns in a mini-batch, we have


\begin{bmatrix}
  {}^{1}U_1^l & {}^{2}U_1^l & \cdots & {}^{mb}U_1^l \\
  {}^{1}U_2^l & {}^{2}U_2^l & \cdots & {}^{mb}U_2^l \\
  \vdots & \vdots & \ddots & \vdots \\
  {}^{1}U_{n_l}^l & {}^{2}U_{n_l}^l & \cdots & {}^{mb}U_{n_l}^l
\end{bmatrix}
= \left[ w^{l-1} \right]
\begin{bmatrix}
  {}^{1}O_1^{l-1} & {}^{2}O_1^{l-1} & \cdots & {}^{mb}O_1^{l-1} \\
  {}^{1}O_2^{l-1} & {}^{2}O_2^{l-1} & \cdots & {}^{mb}O_2^{l-1} \\
  \vdots & \vdots & \ddots & \vdots \\
  {}^{1}O_{n_{l-1}}^{l-1} & {}^{2}O_{n_{l-1}}^{l-1} & \cdots & {}^{mb}O_{n_{l-1}}^{l-1}
\end{bmatrix}
+ \begin{bmatrix}
  \theta_1^l & \theta_1^l & \cdots & \theta_1^l \\
  \theta_2^l & \theta_2^l & \cdots & \theta_2^l \\
  \vdots & \vdots & \ddots & \vdots \\
  \theta_{n_l}^l & \theta_{n_l}^l & \cdots & \theta_{n_l}^l
\end{bmatrix}          (9.2.23)

Equation (9.2.23) can be written using matrix representation as


\left[ U^l \right] = \left[ w^{l-1} \right] \left[ O^{l-1} \right] + \left[ \theta^l \right]          (9.2.24)

where [U^l] and [θ^l] are matrices of n_l rows and mb columns, and [O^{l-1}] is a matrix
of n_{l-1} rows and mb columns.
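
As a minimal sketch (the function name and the argument names are hypothetical, not taken from the book's programs), the forward propagation of Eq. (9.2.24) for one layer maps onto a single cblas_sgemm call, provided that U has been pre-filled with the bias matrix [θ^l] (every column equal to the bias vector), which is what propagationBLAS, shown later in this subsection, does via bias_onesBLAS:

/* U (nl-by-mb) <- w (nl-by-nlm1) * O (nlm1-by-mb) + U, all RowMajor */
#include <cblas.h>

void forward_layer(const float *w, const float *O, float *U,
                   int nl, int nlm1, int mb)
{
    /* beta = 1 keeps the replicated biases already stored in U */
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                nl, mb, nlm1,
                1.0f, w, nlm1,      /* A = [w^{l-1}], ldA = nlm1 */
                O, mb,              /* B = [O^{l-1}], ldB = mb   */
                1.0f, U, mb);       /* C = [U^l],     ldC = mb   */
}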
Next, the back propagation calculation using mini-batches is studied for the case
of a four-layer feedforward neural network. The update rule of the connection weight
for the p-th training pattern in a mini-batch is written as follows (see Eqs. (2.1.31),
(2.1.32) and (2.1.33) in Sect. 2.1):
{}^{p}\Delta w_{ji}^3 = -\frac{\partial\,{}^{p}E}{\partial w_{ji}^3}
  = -\left( {}^{p}O_j^4 - {}^{p}T_j \right) \frac{\partial f\left( {}^{p}U_j^4 \right)}{\partial\,{}^{p}U_j^4}\,{}^{p}O_i^3
  = {}^{p}\delta_j^4 \cdot {}^{p}O_i^3          (9.2.25)

{}^{p}\Delta w_{ji}^2 = -\frac{\partial\,{}^{p}E}{\partial w_{ji}^2}
  = -\sum_{k=1}^{n_4} \left( {}^{p}O_k^4 - {}^{p}T_k \right) \frac{\partial f\left( {}^{p}U_k^4 \right)}{\partial\,{}^{p}U_k^4} w_{kj}^3
    \frac{\partial f\left( {}^{p}U_j^3 \right)}{\partial\,{}^{p}U_j^3}\,{}^{p}O_i^2
  = \sum_{k=1}^{n_4} {}^{p}\delta_k^4 \cdot w_{kj}^3 \frac{\partial f\left( {}^{p}U_j^3 \right)}{\partial\,{}^{p}U_j^3}\,{}^{p}O_i^2
  = {}^{p}\delta_j^3 \cdot {}^{p}O_i^2          (9.2.26)

{}^{p}\Delta w_{ji}^1 = -\frac{\partial\,{}^{p}E}{\partial w_{ji}^1}
  = -\sum_{k=1}^{n_4} \left( {}^{p}O_k^4 - {}^{p}T_k \right) \frac{\partial f\left( {}^{p}U_k^4 \right)}{\partial\,{}^{p}U_k^4}
    \sum_{l=1}^{n_3} w_{kl}^3 \frac{\partial f\left( {}^{p}U_l^3 \right)}{\partial\,{}^{p}U_l^3} w_{lj}^2
    \frac{\partial f\left( {}^{p}U_j^2 \right)}{\partial\,{}^{p}U_j^2}\,{}^{p}O_i^1
  = \sum_{k=1}^{n_4} {}^{p}\delta_k^4 \sum_{l=1}^{n_3} w_{kl}^3 \frac{\partial f\left( {}^{p}U_l^3 \right)}{\partial\,{}^{p}U_l^3} w_{lj}^2
    \frac{\partial f\left( {}^{p}U_j^2 \right)}{\partial\,{}^{p}U_j^2}\,{}^{p}O_i^1
  = \sum_{l=1}^{n_3} {}^{p}\delta_l^3 w_{lj}^2 \frac{\partial f\left( {}^{p}U_j^2 \right)}{\partial\,{}^{p}U_j^2}\,{}^{p}O_i^1
  = {}^{p}\delta_j^2 \cdot {}^{p}O_i^1          (9.2.27)

From Eq. (9.2.25), ^pδ_j^4 for each j is collected as

\begin{pmatrix} {}^{p}\delta_1^4 \\ {}^{p}\delta_2^4 \\ \vdots \\ {}^{p}\delta_{n_4}^4 \end{pmatrix}          (9.2.28)

and ^pδ_j^4 for all the training patterns in a mini-batch can be given by a matrix [δ^4] of
n_4 rows and mb columns as

\begin{bmatrix}
  {}^{1}\delta_1^4 & {}^{2}\delta_1^4 & \cdots & {}^{mb}\delta_1^4 \\
  {}^{1}\delta_2^4 & {}^{2}\delta_2^4 & \cdots & {}^{mb}\delta_2^4 \\
  \vdots & \vdots & \ddots & \vdots \\
  {}^{1}\delta_{n_4}^4 & {}^{2}\delta_{n_4}^4 & \cdots & {}^{mb}\delta_{n_4}^4
\end{bmatrix}
= \left[ \delta^4 \right]          (9.2.29)

From Eq. (9.2.26), ^pδ_j^3 can be expressed as

{}^{p}\delta_j^3 = \sum_{k=1}^{n_4} {}^{p}\delta_k^4 \cdot w_{kj}^3 \frac{\partial f\left( {}^{p}U_j^3 \right)}{\partial\,{}^{p}U_j^3}
= \begin{pmatrix} w_{1j}^3 & w_{2j}^3 & \cdots & w_{n_4 j}^3 \end{pmatrix}
  \begin{pmatrix} {}^{p}\delta_1^4 \\ {}^{p}\delta_2^4 \\ \vdots \\ {}^{p}\delta_{n_4}^4 \end{pmatrix}
  \times \frac{\partial f\left( {}^{p}U_j^3 \right)}{\partial\,{}^{p}U_j^3}          (9.2.30)

Then, from Eq. (9.2.30), ^pδ_j^3 for each j is collected as

\begin{pmatrix} {}^{p}\delta_1^3 \\ {}^{p}\delta_2^3 \\ \vdots \\ {}^{p}\delta_{n_3}^3 \end{pmatrix}
= \left( \begin{bmatrix}
    w_{11}^3 & w_{21}^3 & \cdots & w_{n_4 1}^3 \\
    w_{12}^3 & w_{22}^3 & \cdots & w_{n_4 2}^3 \\
    \vdots & \vdots & \ddots & \vdots \\
    w_{1 n_3}^3 & w_{2 n_3}^3 & \cdots & w_{n_4 n_3}^3
  \end{bmatrix}
  \begin{pmatrix} {}^{p}\delta_1^4 \\ {}^{p}\delta_2^4 \\ \vdots \\ {}^{p}\delta_{n_4}^4 \end{pmatrix} \right)
  \odot
  \begin{pmatrix}
    \dfrac{\partial f({}^{p}U_1^3)}{\partial\,{}^{p}U_1^3} \\
    \dfrac{\partial f({}^{p}U_2^3)}{\partial\,{}^{p}U_2^3} \\
    \vdots \\
    \dfrac{\partial f({}^{p}U_{n_3}^3)}{\partial\,{}^{p}U_{n_3}^3}
  \end{pmatrix}          (9.2.31)

where ʘ denotes the Hadamard product of two vectors, a binary operation that
takes two vectors of the same dimension and produces another vector of the same
dimension whose i-th component is the product of the i-th components of the original
two vectors. Thus, ^pδ_j^3 for all the training patterns in a mini-batch can be written as
\begin{bmatrix}
  {}^{1}\delta_1^3 & {}^{2}\delta_1^3 & \cdots & {}^{mb}\delta_1^3 \\
  {}^{1}\delta_2^3 & {}^{2}\delta_2^3 & \cdots & {}^{mb}\delta_2^3 \\
  \vdots & \vdots & \ddots & \vdots \\
  {}^{1}\delta_{n_3}^3 & {}^{2}\delta_{n_3}^3 & \cdots & {}^{mb}\delta_{n_3}^3
\end{bmatrix}
= \left[ \begin{bmatrix}
    w_{11}^3 & w_{21}^3 & \cdots & w_{n_4 1}^3 \\
    w_{12}^3 & w_{22}^3 & \cdots & w_{n_4 2}^3 \\
    \vdots & \vdots & \ddots & \vdots \\
    w_{1 n_3}^3 & w_{2 n_3}^3 & \cdots & w_{n_4 n_3}^3
  \end{bmatrix}
  \begin{bmatrix}
    {}^{1}\delta_1^4 & {}^{2}\delta_1^4 & \cdots & {}^{mb}\delta_1^4 \\
    {}^{1}\delta_2^4 & {}^{2}\delta_2^4 & \cdots & {}^{mb}\delta_2^4 \\
    \vdots & \vdots & \ddots & \vdots \\
    {}^{1}\delta_{n_4}^4 & {}^{2}\delta_{n_4}^4 & \cdots & {}^{mb}\delta_{n_4}^4
  \end{bmatrix} \right]
  \odot
  \begin{bmatrix}
    \dfrac{\partial f({}^{1}U_1^3)}{\partial\,{}^{1}U_1^3} & \dfrac{\partial f({}^{2}U_1^3)}{\partial\,{}^{2}U_1^3} & \cdots & \dfrac{\partial f({}^{mb}U_1^3)}{\partial\,{}^{mb}U_1^3} \\
    \dfrac{\partial f({}^{1}U_2^3)}{\partial\,{}^{1}U_2^3} & \dfrac{\partial f({}^{2}U_2^3)}{\partial\,{}^{2}U_2^3} & \cdots & \dfrac{\partial f({}^{mb}U_2^3)}{\partial\,{}^{mb}U_2^3} \\
    \vdots & \vdots & \ddots & \vdots \\
    \dfrac{\partial f({}^{1}U_{n_3}^3)}{\partial\,{}^{1}U_{n_3}^3} & \dfrac{\partial f({}^{2}U_{n_3}^3)}{\partial\,{}^{2}U_{n_3}^3} & \cdots & \dfrac{\partial f({}^{mb}U_{n_3}^3)}{\partial\,{}^{mb}U_{n_3}^3}
  \end{bmatrix}          (9.2.32)

Equation (9.2.32) is rewritten as


\left[ \delta^3 \right] = \left[ \left[ w^3 \right]^{T} \left[ \delta^4 \right] \right]
  \odot
  \begin{bmatrix}
    \dfrac{\partial f({}^{1}U_1^3)}{\partial\,{}^{1}U_1^3} & \dfrac{\partial f({}^{2}U_1^3)}{\partial\,{}^{2}U_1^3} & \cdots & \dfrac{\partial f({}^{mb}U_1^3)}{\partial\,{}^{mb}U_1^3} \\
    \dfrac{\partial f({}^{1}U_2^3)}{\partial\,{}^{1}U_2^3} & \dfrac{\partial f({}^{2}U_2^3)}{\partial\,{}^{2}U_2^3} & \cdots & \dfrac{\partial f({}^{mb}U_2^3)}{\partial\,{}^{mb}U_2^3} \\
    \vdots & \vdots & \ddots & \vdots \\
    \dfrac{\partial f({}^{1}U_{n_3}^3)}{\partial\,{}^{1}U_{n_3}^3} & \dfrac{\partial f({}^{2}U_{n_3}^3)}{\partial\,{}^{2}U_{n_3}^3} & \cdots & \dfrac{\partial f({}^{mb}U_{n_3}^3)}{\partial\,{}^{mb}U_{n_3}^3}
  \end{bmatrix}          (9.2.33)

where [δ^3] is a matrix of n_3 rows and mb columns, and [w^3] that of n_4 rows and n_3
columns ([w^3]^T is of n_3 rows and n_4 columns). Note that here ʘ denotes the Hadamard
product of two matrices, a binary operation that takes two matrices of the same dimen-
sions and produces another matrix of the same dimensions whose (i, j)-th component
is the product of the (i, j)-th components of the original two matrices.
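
As a minimal sketch (the function name is hypothetical, and this is not a BLAS routine), the Hadamard product of two matrices stored as one-dimensional RowMajor arrays reduces to an element-wise loop; this is how back_propagationBLAS, shown later in this subsection, applies the derivative matrices to dtemp:

/* a <- a (Hadamard) b, both rows-by-cols matrices stored as 1D RowMajor arrays */
void hadamard(float *a, const float *b, int rows, int cols)
{
    int i, n = rows * cols;
    for (i = 0; i < n; i++) a[i] *= b[i];
}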
In the same manner, from Eq. (9.2.27), we obtain
\begin{bmatrix}
  {}^{1}\delta_1^2 & {}^{2}\delta_1^2 & \cdots & {}^{mb}\delta_1^2 \\
  {}^{1}\delta_2^2 & {}^{2}\delta_2^2 & \cdots & {}^{mb}\delta_2^2 \\
  \vdots & \vdots & \ddots & \vdots \\
  {}^{1}\delta_{n_2}^2 & {}^{2}\delta_{n_2}^2 & \cdots & {}^{mb}\delta_{n_2}^2
\end{bmatrix}
= \left[ \begin{bmatrix}
    w_{11}^2 & w_{21}^2 & \cdots & w_{n_3 1}^2 \\
    w_{12}^2 & w_{22}^2 & \cdots & w_{n_3 2}^2 \\
    \vdots & \vdots & \ddots & \vdots \\
    w_{1 n_2}^2 & w_{2 n_2}^2 & \cdots & w_{n_3 n_2}^2
  \end{bmatrix}
  \begin{bmatrix}
    {}^{1}\delta_1^3 & {}^{2}\delta_1^3 & \cdots & {}^{mb}\delta_1^3 \\
    {}^{1}\delta_2^3 & {}^{2}\delta_2^3 & \cdots & {}^{mb}\delta_2^3 \\
    \vdots & \vdots & \ddots & \vdots \\
    {}^{1}\delta_{n_3}^3 & {}^{2}\delta_{n_3}^3 & \cdots & {}^{mb}\delta_{n_3}^3
  \end{bmatrix} \right]
  \odot
  \begin{bmatrix}
    \dfrac{\partial f({}^{1}U_1^2)}{\partial\,{}^{1}U_1^2} & \dfrac{\partial f({}^{2}U_1^2)}{\partial\,{}^{2}U_1^2} & \cdots & \dfrac{\partial f({}^{mb}U_1^2)}{\partial\,{}^{mb}U_1^2} \\
    \dfrac{\partial f({}^{1}U_2^2)}{\partial\,{}^{1}U_2^2} & \dfrac{\partial f({}^{2}U_2^2)}{\partial\,{}^{2}U_2^2} & \cdots & \dfrac{\partial f({}^{mb}U_2^2)}{\partial\,{}^{mb}U_2^2} \\
    \vdots & \vdots & \ddots & \vdots \\
    \dfrac{\partial f({}^{1}U_{n_2}^2)}{\partial\,{}^{1}U_{n_2}^2} & \dfrac{\partial f({}^{2}U_{n_2}^2)}{\partial\,{}^{2}U_{n_2}^2} & \cdots & \dfrac{\partial f({}^{mb}U_{n_2}^2)}{\partial\,{}^{mb}U_{n_2}^2}
  \end{bmatrix}          (9.2.34)

This equation can be rewritten as


\left[ \delta^2 \right] = \left[ \left[ w^2 \right]^{T} \left[ \delta^3 \right] \right]
  \odot
  \begin{bmatrix}
    \dfrac{\partial f({}^{1}U_1^2)}{\partial\,{}^{1}U_1^2} & \dfrac{\partial f({}^{2}U_1^2)}{\partial\,{}^{2}U_1^2} & \cdots & \dfrac{\partial f({}^{mb}U_1^2)}{\partial\,{}^{mb}U_1^2} \\
    \dfrac{\partial f({}^{1}U_2^2)}{\partial\,{}^{1}U_2^2} & \dfrac{\partial f({}^{2}U_2^2)}{\partial\,{}^{2}U_2^2} & \cdots & \dfrac{\partial f({}^{mb}U_2^2)}{\partial\,{}^{mb}U_2^2} \\
    \vdots & \vdots & \ddots & \vdots \\
    \dfrac{\partial f({}^{1}U_{n_2}^2)}{\partial\,{}^{1}U_{n_2}^2} & \dfrac{\partial f({}^{2}U_{n_2}^2)}{\partial\,{}^{2}U_{n_2}^2} & \cdots & \dfrac{\partial f({}^{mb}U_{n_2}^2)}{\partial\,{}^{mb}U_{n_2}^2}
  \end{bmatrix}          (9.2.35)

where [δ^2] is a matrix of n_2 rows and mb columns, and [w^2] that of n_3 rows and n_2
columns ([w^2]^T is of n_2 rows and n_3 columns).
The amounts of updates of connection weights are obtained from Eqs. (9.2.25),
(9.2.26), and (9.2.27), which are summarized as follows:

\Delta w_{ji}^3 = \sum_{p=1}^{mb} {}^{p}\delta_j^4 \cdot {}^{p}O_i^3          (9.2.36)

\Delta w_{ji}^2 = \sum_{p=1}^{mb} {}^{p}\delta_j^3 \cdot {}^{p}O_i^2          (9.2.37)

\Delta w_{ji}^1 = \sum_{p=1}^{mb} {}^{p}\delta_j^2 \cdot {}^{p}O_i^1          (9.2.38)

From Eq. (9.2.36), we obtain for all the connection weights between the output
layer and the second hidden layer the equation as
\begin{bmatrix}
  \Delta w_{11}^3 & \Delta w_{12}^3 & \cdots & \Delta w_{1 n_3}^3 \\
  \Delta w_{21}^3 & \Delta w_{22}^3 & \cdots & \Delta w_{2 n_3}^3 \\
  \vdots & \vdots & \ddots & \vdots \\
  \Delta w_{n_4 1}^3 & \Delta w_{n_4 2}^3 & \cdots & \Delta w_{n_4 n_3}^3
\end{bmatrix}
= \begin{bmatrix}
    {}^{1}\delta_1^4 & {}^{2}\delta_1^4 & \cdots & {}^{mb}\delta_1^4 \\
    {}^{1}\delta_2^4 & {}^{2}\delta_2^4 & \cdots & {}^{mb}\delta_2^4 \\
    \vdots & \vdots & \ddots & \vdots \\
    {}^{1}\delta_{n_4}^4 & {}^{2}\delta_{n_4}^4 & \cdots & {}^{mb}\delta_{n_4}^4
  \end{bmatrix}
  \begin{bmatrix}
    {}^{1}O_1^3 & {}^{1}O_2^3 & \cdots & {}^{1}O_{n_3}^3 \\
    {}^{2}O_1^3 & {}^{2}O_2^3 & \cdots & {}^{2}O_{n_3}^3 \\
    \vdots & \vdots & \ddots & \vdots \\
    {}^{mb}O_1^3 & {}^{mb}O_2^3 & \cdots & {}^{mb}O_{n_3}^3
  \end{bmatrix}
= \begin{bmatrix}
    {}^{1}\delta_1^4 & {}^{2}\delta_1^4 & \cdots & {}^{mb}\delta_1^4 \\
    {}^{1}\delta_2^4 & {}^{2}\delta_2^4 & \cdots & {}^{mb}\delta_2^4 \\
    \vdots & \vdots & \ddots & \vdots \\
    {}^{1}\delta_{n_4}^4 & {}^{2}\delta_{n_4}^4 & \cdots & {}^{mb}\delta_{n_4}^4
  \end{bmatrix}
  \begin{bmatrix}
    {}^{1}O_1^3 & {}^{2}O_1^3 & \cdots & {}^{mb}O_1^3 \\
    {}^{1}O_2^3 & {}^{2}O_2^3 & \cdots & {}^{mb}O_2^3 \\
    \vdots & \vdots & \ddots & \vdots \\
    {}^{1}O_{n_3}^3 & {}^{2}O_{n_3}^3 & \cdots & {}^{mb}O_{n_3}^3
  \end{bmatrix}^{T}          (9.2.39)

This equation can be written as


\left[ \Delta w^3 \right] = \left[ \delta^4 \right] \left[ O^3 \right]^{T}          (9.2.40)

where [Δw^3] is a matrix of n_4 rows and n_3 columns (the same size as that of [w^3]),
and [O^3] that of n_3 rows and mb columns.
In the same way, from Eqs. (9.2.37) and (9.2.38), Eqs. (9.2.41) and (9.2.42) are
obtained, respectively, as follows:
\left[ \Delta w^2 \right] = \left[ \delta^3 \right] \left[ O^2 \right]^{T}          (9.2.41)

\left[ \Delta w^1 \right] = \left[ \delta^2 \right] \left[ O^1 \right]^{T}          (9.2.42)

where [Δw^2] is a matrix of n_3 rows and n_2 columns (the same size as that of [w^2]),
[Δw^1] that of n_2 rows and n_1 columns (the same size as that of [w^1]), [O^2] that of
n_2 rows and mb columns, and [O^1] that of n_1 rows and mb columns.
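
As a minimal sketch (the function name and argument names are hypothetical), Eqs. (9.2.40)–(9.2.42) map onto one cblas_sgemm call per layer with the second operand transposed, here combined with a momentum term as in back_propagationBLAS shown later in this subsection:

/* dw (nl-by-nlm1) <- Alpha * delta (nl-by-mb) * O^T (mb-by-nlm1) + Moment * dw */
#include <cblas.h>

void weight_update(const float *delta, const float *O, float *dw,
                   int nl, int nlm1, int mb, float Alpha, float Moment)
{
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
                nl, nlm1, mb,
                Alpha, delta, mb,    /* A = [delta], ldA = mb   */
                O, mb,               /* B = [O],     ldB = mb   */
                Moment, dw, nlm1);   /* C = [dw],    ldC = nlm1 */
}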
Here, we note that the update rule for the bias in the p-th training pattern in a
mini-batch can be written as follows (see Eqs. (2.1.46), (2.1.47), and (2.1.48) in
Sect. 2.1):
{}^{p}\Delta\theta_i^4 = -\frac{\partial\,{}^{p}E}{\partial \theta_i^4}
  = -\left( {}^{p}O_i^4 - {}^{p}T_i \right) \frac{\partial f\left( {}^{p}U_i^4 \right)}{\partial\,{}^{p}U_i^4}
  = {}^{p}\delta_i^4          (9.2.43)

{}^{p}\Delta\theta_i^3 = -\frac{\partial\,{}^{p}E}{\partial \theta_i^3}
  = -\sum_{j=1}^{n_4} \left( {}^{p}O_j^4 - {}^{p}T_j \right) \frac{\partial f\left( {}^{p}U_j^4 \right)}{\partial\,{}^{p}U_j^4} w_{ji}^3
    \frac{\partial f\left( {}^{p}U_i^3 \right)}{\partial\,{}^{p}U_i^3}
  = \sum_{j=1}^{n_4} {}^{p}\delta_j^4 \cdot w_{ji}^3 \frac{\partial f\left( {}^{p}U_i^3 \right)}{\partial\,{}^{p}U_i^3}
  = {}^{p}\delta_i^3          (9.2.44)

{}^{p}\Delta\theta_i^2 = -\frac{\partial\,{}^{p}E}{\partial \theta_i^2}
  = -\sum_{j=1}^{n_4} \left( {}^{p}O_j^4 - {}^{p}T_j \right) \frac{\partial f\left( {}^{p}U_j^4 \right)}{\partial\,{}^{p}U_j^4}
    \sum_{k=1}^{n_3} w_{jk}^3 \frac{\partial f\left( {}^{p}U_k^3 \right)}{\partial\,{}^{p}U_k^3} w_{ki}^2
    \frac{\partial f\left( {}^{p}U_i^2 \right)}{\partial\,{}^{p}U_i^2}
  = \sum_{j=1}^{n_4} {}^{p}\delta_j^4 \sum_{k=1}^{n_3} w_{jk}^3 \frac{\partial f\left( {}^{p}U_k^3 \right)}{\partial\,{}^{p}U_k^3} w_{ki}^2
    \frac{\partial f\left( {}^{p}U_i^2 \right)}{\partial\,{}^{p}U_i^2}
  = \sum_{k=1}^{n_3} {}^{p}\delta_k^3 \cdot w_{ki}^2 \frac{\partial f\left( {}^{p}U_i^2 \right)}{\partial\,{}^{p}U_i^2}
  = {}^{p}\delta_i^2          (9.2.45)

Therefore, the amount of update for the biases in the output layer in all the training
patterns in a mini-batch can be written as follows:
\begin{pmatrix} \Delta\theta_1^4 \\ \Delta\theta_2^4 \\ \vdots \\ \Delta\theta_{n_4}^4 \end{pmatrix}
= \begin{pmatrix}
    {}^{1}\Delta\theta_1^4 + {}^{2}\Delta\theta_1^4 + \cdots + {}^{mb}\Delta\theta_1^4 \\
    {}^{1}\Delta\theta_2^4 + {}^{2}\Delta\theta_2^4 + \cdots + {}^{mb}\Delta\theta_2^4 \\
    \vdots \\
    {}^{1}\Delta\theta_{n_4}^4 + {}^{2}\Delta\theta_{n_4}^4 + \cdots + {}^{mb}\Delta\theta_{n_4}^4
  \end{pmatrix}
= \begin{pmatrix}
    {}^{1}\delta_1^4 + {}^{2}\delta_1^4 + \cdots + {}^{mb}\delta_1^4 \\
    {}^{1}\delta_2^4 + {}^{2}\delta_2^4 + \cdots + {}^{mb}\delta_2^4 \\
    \vdots \\
    {}^{1}\delta_{n_4}^4 + {}^{2}\delta_{n_4}^4 + \cdots + {}^{mb}\delta_{n_4}^4
  \end{pmatrix}
= \begin{bmatrix}
    {}^{1}\delta_1^4 & {}^{2}\delta_1^4 & \cdots & {}^{mb}\delta_1^4 \\
    {}^{1}\delta_2^4 & {}^{2}\delta_2^4 & \cdots & {}^{mb}\delta_2^4 \\
    \vdots & \vdots & \ddots & \vdots \\
    {}^{1}\delta_{n_4}^4 & {}^{2}\delta_{n_4}^4 & \cdots & {}^{mb}\delta_{n_4}^4
  \end{bmatrix}
  \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
= \left[ \delta^4 \right] \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}          (9.2.46)

In the same way, from Eqs. (9.2.44) and (9.2.45), we have, respectively,
\begin{pmatrix} \Delta\theta_1^3 \\ \Delta\theta_2^3 \\ \vdots \\ \Delta\theta_{n_3}^3 \end{pmatrix}
= \left[ \delta^3 \right] \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}          (9.2.47)

\begin{pmatrix} \Delta\theta_1^2 \\ \Delta\theta_2^2 \\ \vdots \\ \Delta\theta_{n_2}^2 \end{pmatrix}
= \left[ \delta^2 \right] \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}          (9.2.48)
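
As a minimal sketch (the function name and argument names are hypothetical), Eqs. (9.2.46)–(9.2.48) amount to multiplying [δ] by a vector of ones, i.e., summing each row of [δ], which can be done by a single cblas_sgemv call per layer, again combined with a momentum term as in back_propagationBLAS shown later in this subsection:

/* dbias (length nl) <- Beta * delta (nl-by-mb) * ones (length mb) + Moment * dbias */
#include <cblas.h>

void bias_update(const float *delta, const float *ones, float *dbias,
                 int nl, int mb, float Beta, float Moment)
{
    cblas_sgemv(CblasRowMajor, CblasNoTrans, nl, mb,
                Beta, delta, mb, ones, 1,
                Moment, dbias, 1);
}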

The program DLebpBLAS.c for the calculation of the forward and backward
propagation is shown as follows (Table 9.11 shows a list of main variables and arrays):

Table 9.11 Variables and constants in DLebpBLAS.c


uHU float, 2D array Input to the j-th unit in the i-th hidden layer
zHU float, 2D array Output from the j-th unit in the i-th hidden layer
zdHU float, 2D array First-order derivative of zHU with respect to uHU
t float, 2D array Teacher signals
w float, 2D array Connection weights
dw float, 2D array Weight update for w
bias float, 2D array Biases
dbias float, 2D array Bias update for bias
dtemp float, 2D array Temporary data used in back_propagation
ones float, 1D array 1D vector with all the components set to 1.0
nHL int Number of hidden layers
nHU int, 1D array Number of units in each hidden layer,
nHU[0] = nIU, nHU[nHL + 1] = nOU
bsz int Batchsize
pbatch int Pointer to the location in a big array for the specific minibatch
Alpha float Training coefficient for the weights
Beta float Training coefficient for the biases

/* DLebpBLAS.c */
/*---------------------------------------------------*/
void propagationBLAS(
float **uHU,
float **zHU,
float **zdHU,
float **w,
float **bias,
int *nHU,
int nHL,
int bsz)
{
int i, j, k, ia;
for(i=0;i<nHL;i++) {
bias_onesBLAS(uHU[i], bias[i], nHU[i+1], bsz);
cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
nHU[i+1], bsz, nHU[i], 1, w[i], nHU[i], zHU[i],
bsz, 1, uHU[i], bsz);
for(j=0;j<nHU[i+1];j++){
for(k=0;k<bsz;k++){
ia = IDX(j, k, bsz) ;
a0f(zHU[i+1]+ia, zdHU[i+1]+ia, uHU[i][ia]);
}
}
}

i = nHL;
bias_onesBLAS(uHU[i], bias[i], nHU[i+1], bsz);
cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
nHU[i+1], bsz, nHU[i], 1, w[i], nHU[i], zHU[i], bsz,
1, uHU[i], bsz);
for(j=0;j<nHU[i+1];j++){
for(k=0;k<bsz;k++){
ia = IDX(j, k, bsz) ;
a1f(zHU[i+1]+ia, zdHU[i+1]+ia, uHU[i][ia]) ;
}
}
}
/*---------------------------------------------------*/
void back_propagationBLAS(
float **uHU,
float **zHU,
float **zdHU,
float *t,
float **w,
float **bias,
float **dw,
float **dbias,
float *ones,
float **dtemp,
int *nHU,
int nHL,
float Alpha,
float Beta,
int bsz,
int pbatch)
{
int i, j, k, ia;
for(i=0;i<nHU[nHL+1];i++){
for(j=0;j<bsz;j++){
ia = IDX(i, j, bsz) ;
dtemp[nHL][ia] = t[IDX(j+pbatch, i, nHU[nHL+1])]
- zHU[nHL+1][ia];
dtemp[nHL][ia] *= zdHU[nHL+1][ia];
}
}
for(i=nHL-1;i>=0;i--){
cblas_sgemm(CblasRowMajor, CblasTrans, CblasNoTrans,
nHU[i+1], bsz, nHU[i+2], 1, w[i+1], nHU[i+1],
dtemp[i+1], bsz, 0, dtemp[i], bsz);
for(j=0;j<nHU[i+1];j++){
for(k=0;k<bsz;k++){
ia = IDX(j, k, bsz);
dtemp[i][ia] *= zdHU[i+1][ia];
}
}

}
for(i=0;i<nHL+1;i++){
cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
nHU[i+1],nHU[i], bsz, Alpha, dtemp[i], bsz,
zHU[i], bsz, Moment1, dw[i], nHU[i]);
cblas_sgemv(CblasRowMajor, CblasNoTrans, nHU[i+1],
bsz, Beta, dtemp[i], bsz, ones, 1, Moment1,
dbias[i], 1);
cblas_saxpy(nHU[i+1] * nHU[i], 1.0f, dw[i], 1, w[i], 1);
cblas_saxpy(nHU[i+1], 1.0f, dbias[i], 1, bias[i], 1);
}
}
/*---------------------------------------------------*/

In the code above, propagationBLAS() is a function that computes the


forward propagation sequentially from the input layer to the output layer. Operations
in each layer are written by three functions as
bias_onesBLAS(): Calculation of the second term on the right-hand side of Eq.
(9.2.24).
cblas_sgemm(): Calculation of the right-hand side of Eq. (9.2.24)
a0f(), a1f(): Calculation of the output and its derivative for the input value
of each unit (Eq. (9.2.24)).
Note that IDX(a,b,c) is a macro denoting a*c + b.
back_propagationBLAS() is a function that computes the backward prop-
agation, starting from the output layer to the input layer. It consists of three for
loops, the first two of which compute [δ] (dtemp[][] in the program) as shown in
Eqs. (9.2.29), (9.2.33), and (9.2.35).
In the last for-loop, calculations are performed for connection weights and biases
of each layer and between layers as
cblas_sgemm(): Calculation of [Δw] using Eqs. (9.2.40)–(9.2.42).
cblas_sgemv(): Calculation of {Δθ} using Eqs. (9.2.46)–(9.2.48).
cblas_saxpy(): Update of [w].
cblas_saxpy(): Update of {θ}.
Next, DLneuroBLAS.c, which includes the main function, is shown as follows
(A list of main variables and arrays used is given in Table 9.12):

/* DLneuroBLAS.c */
#include <cblas.h>
#include <float.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

Table 9.12 Variables and constants in DLneuroBLAS.c


nIU int Number of units in the input layer
nOU int Number of units in the output layer
MaxPattern int Number of patterns (Training + Test)
lp_no int Number of training patterns
tp_no int Number of test patterns
MaxEpochs int Number of epochs
Wmin float Lower bound of the range in which all the weights are initially set
Wmax float Upper bound of the range in which all the weights are initially set
Alpha float Training coefficient for the weights
Beta float Training coefficient for the biases
Moment1 float Coefficient in the momentum method for the weight update
nlev float Amount of noise used for data augmentation
oIUor float, 2D array Input data (without noise)
w_min float, 2D array Weights corresponding to the minimum test error
bias_min float, 2D array Biases corresponding to the minimum test error

#include <string.h>
#include <sys/time.h>
#define MaxH_UnitNo 1000
#define MaxO_UnitNo 1000
#define FNAMELENGTH 100
#define NHU_V 1
#define NHU_C 0
#define Moment1 0.05f
#define Wmin -0.30f
#define Wmax 0.30f
#define rnd() (drand48() * (Wmax - Wmin) + Wmin)
#define noise() ((drand48() - 0.5f) * 2.0f * nlev)
#define IDX(i, j, ld) ((ld) * (i) + (j))
#include "DLneuroBLAS_mem.c"
#include "DLcommonBLAS.c"
#include "DLebpBLAS.c"
/*---------------------------------------------------*/
int main(void){
int i, j, k, m, ia, ib, ic, iteration_min=0, rseed, o_freq,
bsz, MaxPattern,MaxEpochs, lp_no, tp_no, nIU, nOU, *nHU,
nHL, nHU0, nhflag, thread, pbatch;
char fname1[FNAMELENGTH];
float ef1, ef2, ef2_min = 1e6, nlev, Alpha, Beta, *t, *d, *oIU,
*oIUor, *ones, **uHU, **zHU,**zdHU, **w, **bias,
**dw, **dbias, **w_min, **bias_min, **dtemp;
//---(Part 1)-----------------------------------

scanf("%d", &thread);
openblas_set_num_threads(thread);
scanf("%d %d %d %d %d %d %d %d %d %d %d %s %d %e %e %e",
&MaxPattern,
&lp_no, &tp_no, &bsz, &nIU, &nHU0, &nOU, &nHL, &nhflag,
&MaxEpochs, &o_freq, fname1, &rseed, &Alpha, &Beta, &nlev);
srand48(rseed);
//---(Part 2)-----------------------------------
nHU = (int *)array1D(sizeof(int), nHL+2);
if(nhflag==NHU_V){
for(i=1;i<=nHL;i++) scanf("%d", nHU+i);
}else{
for(i=1;i<=nHL;i++) nHU[i] = nHU0;
}
nHU[0] = nIU;
nHU[nHL+1] = nOU;
//---(Part 3)-----------------------------------
t = (float *)array1D(sizeof(float), MaxPattern * nOU);
d = (float *)array1D(sizeof(float), nOU * bsz);
oIUor = (float *)array1D(sizeof(float), MaxPattern * nIU);
oIU = (float *)array1D(sizeof(float), lp_no * nIU);
uHU = array2D_u(nHL, nHU, bsz);
zHU = array2D_z(nHL, nHU, bsz);
zdHU = array2D_z(nHL, nHU, bsz);
w = array2D_w(nHL, nHU);
w_min = array2D_w(nHL, nHU);
dw = array2D_w(nHL, nHU);
bias = array2D_bias(nHL, nHU);
bias_min = array2D_bias(nHL, nHU);
dbias = array2D_bias(nHL, nHU);
dtemp = array2D_u(nHL, nHU, bsz);
ones = (float *)array1D(sizeof(float), bsz);
for(i=0;i<bsz;i++) ones[i] = 1.0;
//---(Part 4)-----------------------------------
read_fileBLAS(fname1,oIUor,t,nIU,nOU,lp_no,tp_no);
initializeBLAS(w, dw, bias, dbias, nHU, nHL);
//---(Part 5)-----------------------------------
for(i=0;i<=MaxEpochs;i++){
for(j=0;j<lp_no/bsz;j++){
pbatch = j * bsz;
batchcopy_noiseBLAS(nlev, &oIUor[0], &zHU[0][0],
pbatch, nIU, bsz);
propagationBLAS(uHU, zHU, zdHU, w, bias, nHU, nHL, bsz);
back_propagationBLAS(uHU, zHU, zdHU, t, w, bias,
dw, dbias, ones,
dtemp, nHU, nHL, Alpha, Beta, bsz, pbatch);
}
if((j=lp_no%bsz)>=1){
pbatch = lp_no - bsz;
batchcopy_noiseBLAS(nlev, &oIUor[0], &zHU[0][0],

pbatch, nIU, bsz);


propagationBLAS(uHU, zHU, zdHU, w, bias, nHU, nHL, bsz);
back_propagationBLAS(uHU, zHU, zdHU, t, w, bias,
dw, dbias, ones,
dtemp, nHU, nHL, Alpha, Beta, bsz, pbatch);
}
if(i%o_freq==0 || i==MaxEpochs){
ef1 = 0.0f;
ef2 = 0.0f;
for(j=0;j<lp_no/bsz;j++){
pbatch = j * bsz;
batchcopyBLAS(&oIUor[0], &zHU[0][0], pbatch,
nIU, bsz);
propagationBLAS(uHU, zHU, zdHU, w, bias,
nHU, nHL, bsz);
for(k=0;k<bsz;k++){
for(m=0;m<nOU;m++){
ia = IDX(k, m, nOU);
ib = IDX(k+pbatch, m, nOU);
ic = IDX(m, k, bsz);
d[ia] = t[ib] - zHU[nHL+1][ic];
}
}
ef1 += cblas_sasum(nOU*bsz, &d[0], 1);
}
if((j=lp_no%bsz)>=1){
pbatch = lp_no - bsz;
batchcopyBLAS(&oIUor[0], &zHU[0][0],
pbatch, nIU, bsz);
propagationBLAS(uHU, zHU, zdHU, w, bias, nHU, nHL, bsz);
for(k=0;k<bsz;k++){
for(m=0;m<nOU;m++){
ia = IDX(k, m, nOU);
ib = IDX(k+pbatch, m, nOU);
ic = IDX(m, k, bsz);
d[ia] = t[ib] - zHU[nHL+1][ic];
}
}
ef1 += cblas_sasum(nOU * j, &d[0], 1);
}
for(j=0;j<tp_no/bsz;j++){
pbatch = j * bsz + lp_no;
batchcopyBLAS(&oIUor[0], &zHU[0][0],
pbatch, nIU, bsz);
propagationBLAS(uHU, zHU, zdHU, w, bias,
nHU, nHL, bsz);
for(k=0;k<bsz;k++){
for(m=0;m<nOU;m++){
ia = IDX(k, m, nOU);

ib = IDX(k+pbatch, m, nOU);
ic = IDX(m, k, bsz);
d[ia] = t[ib] - zHU[nHL+1][ic];
}
}
ef2 += cblas_sasum(nOU * bsz, &d[0], 1);
}
if ((j = tp_no % bsz) >= 1) {
pbatch = lp_no + tp_no - bsz;
batchcopyBLAS(&oIUor[0], &zHU[0][0],
pbatch, nIU, bsz);
propagationBLAS(uHU, zHU, zdHU, w, bias,
nHU, nHL, bsz);
for(k=0;k<bsz;k++){
for(m=0;m<nOU;m++){
ia = IDX(k, m, nOU);
ib = IDX(k+pbatch, m, nOU);
ic = IDX(m, k, bsz);
d[ia] = t[ib] - zHU[nHL+1][ic];
}
}
ef2 += cblas_sasum(nOU * j, &d[0], 1);
}
printf("%d th Error : %.5e %.5e\n",
i, ef1/lp_no, ef2/tp_no);
if((ef2/tp_no)<=ef2_min){
ef2_min = ef2/tp_no;
iteration_min = i;
store_weightBLAS(w, bias, w_min, bias_min, nHU, nHL);
}
}
}
//---(Part 6)-----------------------------------
show_resultsBLAS(w, bias, w_min, bias_min, nHU, nHL);
return 0;
}

The structure of the program code above is summarized as follows:


part 1: Loading of various meta-parameters and initialization of a random number
generator.
part 2: Setting of the number of units in the hidden layers.
part 3: Memory allocation for most arrays.
part 4: Loading of training patterns and initialization of the connection weights.
part 5: Error back propagation learning using mini-batches. It includes the
handling of the case where the number of training patterns is not divisible by
the mini-batch size. The progress of training is monitored every o_freq epochs.
part 6: Output the connection weights and biases after training.

The functions used to allocate memory space for arrays in the part 3 are defined
in DLneuroBLAS_mem.c as follows:

/* DLneuroBLAS_mem.c */
/*---------------------------------------------------*/
void *array1D(size_t size, int row)
{
char *v;
v = (char *)calloc(row, size);
return v;
}
/*---------------------------------------------------*/
float **array2D(int row, int col)
{
float **a;
int i;
a = (float **)calloc(row, sizeof(float *));
a[0] = (float *)calloc(row * col, sizeof(float));
for(i=1;i<row;i++) a[i] = a[i-1] + col;
return a;
}
/*---------------------------------------------------*/
float ***array3D(int x, int y, int z)
{
float ***a;
int i, j;
a = (float ***)calloc(x, sizeof(float **));
a[0] = (float **)calloc(x * y, sizeof(float *));
a[0][0] = (float *)calloc(x * y * z, sizeof(float));
for(i=0;i<x;i++){
a[i] = a[0] + i * y;
for(j=0;j<y;j++) a[i][j] = a[0][0] + i*y*z + j*z;
}
return a;
}
/*---------------------------------------------------*/
float **array2D_bias(int nHL, const int *nHU)
{
float **a;
int i, y;
a = (float **)calloc(nHL + 1, sizeof(float *));
for(i=0,y=0;i<nHL+1;i++) y += nHU[i+1];
a[0] = (float *)calloc(y, sizeof(float));
for(i=0,y=0;i<nHL+1;i++){
a[i] = a[0] + y;
y += nHU[i+1];
}
return a;
}
/*---------------------------------------------------*/
float **array2D_w(int nHL, const int *nHU)

{
float **a;
int i, m;
a = (float **)calloc(nHL+1, sizeof(float *));
for(i=0,m=0;i< nHL+1;i++) m += nHU[i] * nHU[i+1];
a[0] = (float *)calloc(m, sizeof(float));
for(i=0,m=0;i<nHL+1;i++){
a[i] = a[0] + m;
m += nHU[i+1] * nHU[i];
}
return a;
}
/*---------------------------------------------------*/
float **array2D_u(int nHL, const int *nHU, int bsz)
{
float **a;
int i, y;
a = (float **)calloc(nHL+1, sizeof(float *));
for(i=1,y=0;i<=nHL+1;i++) y += nHU[i] * bsz;
a[0] = (float *)calloc(y, sizeof(float));
for(i=0,y=0;i<nHL+1;i++){
a[i] = a[0] + y;
y += nHU[i+1]*bsz;
}
return a;
}
/*---------------------------------------------------*/
float **array2D_z(int nHL, const int *nHU, int bsz)
{
float **a;
int i, y;
a = (float **)calloc(nHL + 2, sizeof(float *));
for(i=0,y=0;i<nHL+2;i++) y += nHU[i];
a[0] = (float *)calloc(y * bsz, sizeof(float));
for(i=0,y=0;i<nHL+2;i++){
a[i] = a[0] + y;
y += nHU[i]*bsz;
}
return a;
}
/*---------------------------------------------------*/
void free2D(float **a)
{
free(a[0]);
free(a);
}
/*---------------------------------------------------*/
void free3D(float ***a)

{
free(a[0][0]);
free(a[0]);
free(a);
}

Other functions are defined in DLcommonBLAS.c as

/* DLcommonBLAS.c */
/*---------------------------------------------------*/
void a0f(
float *a,
float *da,
float x)
{
float d;
d = (1.0f + tanhf(x))*0.5f ;
*a = d;
*da = d*(1.0f - d) ;
}
/*---------------------------------------------------*/
void a1f(
float *a,
float *da,
float x)
{
float d;
d = (1.0f + tanhf(x))*0.5f ;
*a = d;
*da = d*(1.0f - d) ;
}
/*---------------------------------------------------*/
void clear_deltaBLAS(
float **dtemp,
int nHL,
int *nHU,
int bsz)
{
int i, y;
for(i=1,y=0;i<=nHL+1;i++) y += nHU[i];
y *= bsz;
cblas_sscal(y, 0, dtemp[0], 1);
}
/*---------------------------------------------------*/
void bias_onesBLAS(
float *uHU,
float *bias,
int row,
int bsz)

{
int i, j;
for(i=0;i<row;i++)
for(j=0;j<bsz;j++) uHU[IDX(i, j, bsz)] = bias[i];
}
/*---------------------------------------------------*/
void batchcopy_noiseBLAS(
float nlev,
float *oIUor,
float *zHU,
int pbatch,
int nIU,
int bsz)
{
int i, j, ia, ib;
for(i=0;i<nIU;i++){
for(j=0;j<bsz;j++){
ia = IDX(i, j, bsz);
ib = IDX(j+pbatch, i, nIU);
zHU[ia] = (1.0f + noise()) * oIUor[ib];
}
}
}
/*---------------------------------------------------*/
void batchcopyBLAS(
float *oIUor,
float *zHU,
int pbatch,
int nIU,
int bsz)
{
int i, j;
for(i=0;i<nIU;i++)
for(j=0;j<bsz;j++)
zHU[IDX(i, j, bsz)] = oIUor[IDX(j+pbatch, i, nIU)];
}
/*---------------------------------------------------*/
void read_fileBLAS(
char *name,
float *o1,
float *t,
int nIU,
int nOU,
int lp_no,
int tp_no)
{
int i, j, k;
FILE *fp;
fp = fopen(name, "r") ;
for(i=0;i<lp_no+tp_no;i++){
fscanf(fp, "%d", &k);

for(j=0;j<nIU;j++)
fscanf(fp,"%e", &o1[IDX(i, j, nIU)]);
for(j=0;j<nOU;j++)
fscanf(fp,"%e", &t[IDX(i, j, nOU)]);
}
fclose(fp);
}
/*---------------------------------------------------*/
void initializeBLAS(
float **w,
float **dw,
float **bias,
float **dbias,
int *nHU,
int nHL)
{
int i, j, k;
for(i=0;i<nHL+1;i++)
for(j=0;j<nHU[i+1];j++)
for(k=0;k<nHU[i];k++)
w[i][IDX(j, k, nHU[i])] = rnd();
for(i=0;i<nHL+1;i++)
for(j=0;j<nHU[i+1];j++) bias[i][j] = rnd();
for(i=0,k=0;i<nHL+1;i++) k += nHU[i] * nHU[i+1];
cblas_sscal(k, 0.0f, dw[0], 1);
for(i=0,k=0;i<nHL+1;i++) k += nHU[i+1];
cblas_sscal(k, 0.0f, dbias[0], 1);
}
/*---------------------------------------------------*/
void store_weightBLAS(
float **w,
float **bias,
float **w_min,
float **bias_min,
int *nHU,
int nHL)
{
int i, k;
for(i=0,k=0;i<nHL+1;i++) k += nHU[i]*nHU[i+1];
cblas_scopy(k, w[0], 1, w_min[0], 1);
for(i=0,k=0;i<nHL+1;i++) k += nHU[i+1];
cblas_scopy(k, bias[0], 1, bias_min[0], 1);
}
/*---------------------------------------------------*/
void show_resultsBLAS(
float **w,
float **bias,
float **w_min,
float **bias_min,
int *nHU,
int nHL)

{
int i, j, iL;
for(iL=0;iL<=nHL;iL++){
for(i=0;i<nHU[iL];i++){
printf("%5d", i);
for(j=0;j<nHU[iL+1];j++)
printf(" %e", w_min[iL][IDX(j, i, nHU[iL])]);
printf("\n");
}
}
for(iL=0;iL<=nHL;iL++){
for(j=0;j<nHU[iL+1];j++)
printf("%e ", bias_min[iL][j]);
printf("\n");
}
for(iL=0;iL<=nHL;iL++){
for(i=0;i<nHU[iL];i++){
printf("%5d", i);
for(j=0;j<nHU[iL+1];j++)
printf(" %e", w[iL][IDX(j, i, nHU[iL])]);
printf("\n");
}
}
for(iL=0;iL<=nHL;iL++){
for(j=0;j<nHU[iL+1];j++) printf("%e ", bias[iL][j]);
printf("\n");
}
}
/*---------------------------------------------------*/

A compilation example is shown below. Note that DLneuroBLAS.c has been


tested on Linux and the include file directory should be changed depending on user’s
environment.
$ cc -O3 -o DLneuroBLAS.exe DLneuroBLAS.c -I/usr/include/openblas \
  -lopenblas -lm

After compiling, run the program as follows. In this case, the result will be stored
in the result.txt file.
$ echo "1 1000 800 200 20 5 20 3 2 0 5000 100 indata.dat 12345 0.1 0.1
0.001" | ./DLneuroBLAS.exe > result.txt

In the above example, the number of threads is set to 1, the mini-batch size to 20,
the number of hidden layers to 2, the number of units in each hidden layer to 20, the
number of training epochs to 5000, and so on.
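
For reference, the following comment block (not part of DLneuroBLAS.c; the values are simply those of the example above) maps each field of the echo line onto the variable read in Part 1 of main:

/* "1 1000 800 200 20 5 20 3 2 0 5000 100 indata.dat 12345 0.1 0.1 0.001"
   thread     = 1         number of OpenBLAS threads
   MaxPattern = 1000      total number of patterns
   lp_no      = 800       number of training patterns
   tp_no      = 200       number of test patterns
   bsz        = 20        mini-batch size
   nIU        = 5         number of units in the input layer
   nHU0       = 20        number of units in each hidden layer
   nOU        = 3         number of units in the output layer
   nHL        = 2         number of hidden layers
   nhflag     = 0         same number of units in every hidden layer
   MaxEpochs  = 5000      number of training epochs
   o_freq     = 100       output frequency (epochs)
   fname1     = indata.dat   input data file
   rseed      = 12345     seed of the random number generator
   Alpha      = 0.1       training coefficient for the weights
   Beta       = 0.1       training coefficient for the biases
   nlev       = 0.001     amount of noise used for data augmentation        */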

9.2.3 Sample Code for Feedforward Neural Networks


in Python Language

In Sects. 9.2.1 and 9.2.2, programs for feedforward neural networks in C have been
presented with a lot of mathematical formulas. The Python program of a feedforward
neural network given here, on the other hand, shows that it can be programmed very
concisely by making use of libraries.
Python is a relatively new programming language introduced in 1991 by Guido
van Rossum. Since many deep learning libraries are designed to be used with Python,
Python has become the indispensable language for deep learning.
There are a number of Python-based libraries for deep learning including
TensorFlow (https://www.tensorflow.org)
Keras (https://keras.io)
PyTorch (https://pytorch.org)
Chainer (https://chainer.org)
We use, here, Keras as a front-end to TensorFlow.
While the above libraries are specialized for deep learning, many other libraries
are also developed and are widely available for using Python not only for machine
learning, including deep learning, but also for general-purpose numerical computa-
tion. In the program of a feedforward neural network discussed here, we use two
libraries as follows:
NumPy (https://numpy.org).
pandas (https://pandas.pydata.org).
The former [4] is a library for matrix and vector operations that is essential in
scientific and engineering calculations, which is commonly used in most numerical
programs in Python, while the latter [11] is a library for operations commonly used
in data analysis, which supports a variety of data formats, and the program discussed
here uses this library for loading input data files.
Now, let us discuss DLneuroPython.py, a Python program for a fully
connected feedforward neural network. This program has the following features
in common with DLneuro.c in Sect. 9.2.1.
Network structure: Fully connected feedforward type.
Number of hidden layers: Arbitrary.
Number of units in a hidden layer: Arbitrary.
Activation function for hidden layers: Sigmoid function.
Activation function for the output layer: Sigmoid function.
Error function: Squared error.
Minimization method: Stochastic gradient descent method.
Note that the activation and error functions are changeable with others. In partic-
ular, for the Python program employed here, they are easily changed to others thanks
to the support of the library.

Table 9.13 shows a list of main variables and constants in DLneuroPython.py.


Many of them are also employed in the programs in Sects. 9.2.1 and 9.2.2.
The input data file is assumed to be the same as those in Sect. 9.2.1 as

1 0.11 0.31 0.12 0.70 0.25 0.52 0.20 0.91


2 0.10 0.20 0.84 0.69 0.40 0.43 0.85 0.10
3 0.32 0.55 0.21 0.03 0.51 0.18 0.15 0.34
.....
1000 0.27 0.80 0.90 0.22 0.37 0.35 0.15 0.73

The input data sample above is for the case of 5 input data (parameters) and 3
output (teacher) data. The total number of patterns, including training patterns and
verification patterns, is 1000. Each row corresponds to a pattern: the first column is a

Table 9.13 Variables and constants in DLneuroPython.py


nIU int Number of units in the input layer
nOU int Number of units in the output layer
nHL int Number of hidden layers
nHU int, 1D array Number of units in each layer:
nHU[0] = nIU, nHU[nHL + 1] = nOU
nHU0 int Number of units in hidden layer (This value is not used after the
array nHU is defined.)
nhflag int nhflag = 0: The number of units in each hidden layer is set to
nHU0
nhflag = 1: The numbers of units hidden layers are provided in
the additional command line arguments
lp_no int Number of training patterns
tp_no int Number of test patterns
MaxEpochs int Number of epochs
Alpha float Training coefficient for the weights
Moment float The coefficient in the momentum method for the weight update
rflag int Number of leading columns (e.g., sequential numbers) in each line of the
input data file to be skipped
rseed int Seed for random number generator
i_fname char File name of input data
o_fname char Directory name for results
wmin_dir char Directory name for weights corresponding to the minimum test
error
d_train float, 2D array Input data of training patterns
t_train float, 2D array Teacher signals of training patterns
d_test float, 2D array Input data of test patterns
t_test float, 2D array Teacher signals of test patterns

sequential number, columns 2–6 are the input data, and columns 7–9 the teacher data.
Both the input and the teacher data are assumed to be single-precision real values.
Now, let’s study the details of the program. For convenience, the program is
divided into nine parts.
Part 1: This is to import the libraries to be used. As discussed above, NumPy
is used for general-purpose array description, pandas for loading data files, and the
Keras for deep learning. Note that Keras is used as a front-end for TensorFlow.
Part 2: This part is to define the function to read the input data file.
pd.read_csv(), a pandas function, is used in the readfile0() function,
where sep= ’ ’ is specified because the input data file employs whitespace char-
acters as field separators. All the rows and columns in the input file are read into the
array arr using pd.read_csv(), and then the input data are stored in the array
d and the teacher data in the array t using np.array() of the NumPy library.
When reading the input file, the columns are deleted if any unnecessary sequential
numbers exist in the rflag columns at the beginning of each line of the input data.
The readfile() function, dividing the input data array d and the teacher data
array t read by the readfile0() function into d_train and t_train for
training and d_test and t_test for verification, respectively, stores them in the
arrays.
Part 3: This part is the beginning of the definition section of the main() function.
In this program, various parameters are given as command line arguments at startup.
(See Table 9.13).
Part 4: Here, the numbers of units in hidden layers are set. When nhflag is 0,
each number of units in all the hidden layers is the same, nHU0. On the other hand,
when nhflag is set to 1, the numbers must be given by the command line arguments
at startup.
Provided that the number of hidden layers is three and that of units in all the
hidden layers 10, we specify the following,
$python DLneuroPython.py 800 200 5 10 3 3 0 100 sample.dat 13721 0.1
0.1 1 wm res

If we want to set the numbers of units in hidden layers to 20, 15, and 5, respectively,
we specify the following,
$python DLneuroPython.py 800 200 5 10 3 3 1 100 sample.dat 13721 0.1
0.1 1 wm res 20 15 5

In the latter case, nHU [ ] is set as follows:


nHU[0]=5,nHU[1]=20, nHU[2]=15, nHU[3]=5, nHU[4]=3

Part 5: Using readfile() defined in Part 2, input data are divided and stored
in the input data array d_train and the teacher data array t_train for training,
and the input data array d_test and the teacher data array t_test for verification.
After that, the seed of the random number generator is set to initialize the weights.
Part 6: The configuration of a feedforward neural network is determined based
on the number of input parameters (number of units in the input layer) nIU, that of

units in the output layer nOU, that of hidden layers nHL, and that of units in each
hidden layer nHU[]. The activation function is specified as the sigmoid function
with activation = "sigmoid".
Other options for the activation function include the following functions.
'sigmoid' sigmoid function (see Eq. (2.1.3) in Sect. 2.1).
'tanh' tanh function (see Eq. (2.1.4) in Sect. 2.1).
'relu' ReLU function (see Eq. (2.1.5) in Sect. 2.1).
'linear' linear function (see Eq. (2.1.10) in Sect. 2.1).
While the sum of squared errors is selected with loss =
'mean_squared_error' as the error function, there are other choices
for the error function as follows:
'mean_squared_error' mean squared error.
'mean_absolute_error' mean absolute error.
'categorical_crossentropy' crossentropy (for classification problems).
Regarding the optimization method, stochastic gradient descent (SGD) is specified
by optimizer = sgd, but there are various high performance optimization
methods as
SGD Stochastic Gradient Descent (Sect. 2.3.1).
RMSprop RMSProp [19] (Sect. 2.3.2).
Adagrad AdaGrad [2] (Sect. 2.3.2).
Adam Adam [8] (Sect. 2.3.3).
At the end of Part 6, the shape of the neural network constructed is output by
ffnn.summary().
Part 7: The training of the neural network is completed with a single line of
history = ffnn.fit(), which sets the Checkpoint and stores the
connection weights that minimize the verification error.
Part 8: The trained neural network is stored.
Part 9: This part specifies the condition for the main function to start running.
The full code of DLneuroPython.py is shown below.

# DLneuroPython.py
#----------(Part 1)----------
import sys
import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import optimizers
from tensorflow.keras import models
from tensorflow.keras import callbacks
#----------(Part 2)----------

def readfile0(num,nIU,nOU,rflag,fname):
arr = pd.read_csv(fname,header=None,sep=’ ’,nrows=num)
d = np.array(arr.iloc[0 : num, rflag : \
rflag+nIU]) .astype (’float’)
t = np.array(arr.iloc[0 : num, rflag+nIU : \
rflag+nIU+nOU]) .astype(’float’)
return d, t
#---------------
def readfile(ntrain,ntest,nIU,nOU,rflag,fname,sep=’ ’):
d, t = readfile0(ntrain+ntest,nIU,nOU,rflag,fname)
d_train = d[0:ntrain]
d_test = d[-ntest:]
t_train = t[0:ntrain]
t_test = t[-ntest:]
return d_train,t_train,d_test,t_test
#----------(Part 3)----------
def main():
argv = sys.argv
argc = len(argv)
argn =0
lp_no = int(argv[1])
tp_no = int(argv[2])
nIU = int(argv[3])
nHU0 = int(argv[4])
nOU = int(argv[5])
nHL = int(argv[6])
nhflag = int(argv[7])
MaxEpochs = int(argv[8])
i_fname = argv[9]
rseed = int(argv[10])
Alpha = float(argv[11])
Moment = float(argv[12])
rflag = int(argv[13])
wmin_dir = argv[14]
o_fname = argv[15]
argn=16
#----------(Part 4)----------
nHU = np.array(nIU)
if nhflag == 0:
for i in range(nHL):
nHU = np.append(nHU,nHU0)
else:
for i in range(nHL):
nHU = np.append(nHU,int(argv[argn]))
argn += 1
nHU = np.append(nHU,nOU)
9.2 Computer Programming for Training Phase 371

#----------(Part 5)----------
d_train,t_train,d_test,t_test=readfile(lp_no,tp_no,nIU,\
nOU,rflag,i_fname,sep=’ ’)
np.random.seed(rseed)
#----------(Part 6)----------
ffnn = models.Sequential()
ffnn.add(layers.Dense(units=nHU[1],activation="sigmoid", \
input_shape=(nIU,)))
if nHL > 1:
for j in range(2,nHL+1):
ffnn.add(layers.Dense(units=nHU[j], \)
activation="sigmoid"))
ffnn.add(layers.Dense(units=nOU,activation="sigmoid"))
sgd = optimizers.SGD(learning_rate=Alpha, momentum=Moment)
ffnn.compile(loss=’mean_squared_error’, optimizer=sgd)
ffnn.summary()
#----------(Part 7)----------
checkpoint_path = wmin_dir
cp = callbacks.ModelCheckpoint(checkpoint_path, \
monitor=’val_loss’,save_best_only=True, \
save_weights_only=True,verbose=1)
history = ffnn.fit(d_train, t_train, epochs=MaxEpochs,\
callbacks=[cp],validation_data=(d_test, t_test))
#----------(Part 8)----------
ffnn.save(o_fname)
#----------(Part 9)----------
if __name__ == ’__main__’:
main()
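
As a usage note (a minimal sketch, not part of DLneuroPython.py itself), the model
saved in Part 8 and the best weights stored by the checkpoint in Part 7 can be
reloaded for later inference; here "res" and "wm" are the o_fname and wmin_dir values
used in the startup examples above, and d_new is a hypothetical array of new input
patterns of shape (number of patterns, nIU):

from tensorflow.keras import models

ffnn = models.load_model("res")   # network saved by ffnn.save(o_fname)
ffnn.load_weights("wm")           # best weights written by the checkpoint
y = ffnn.predict(d_new)           # outputs, one row per input pattern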

9.2.4 Sample Code for Convolutional Neural Networks in Python Language

9.2.4.1 Sample Code for Convolutional Neural Networks Using Tensorflow and Keras

One of the major factors that have led to the rise of deep learning is the success of
convolutional neural networks (CNNs), which are well suited for handling data
such as images and audio.
Using Python (specifically Tensorflow + Keras), a basic program for a convolutional
neural network is presented here. We take up the MNIST (Modified National Institute of
Standards and Technology database) handwritten number identification problem.
The problem is to identify the digit, from 0 to 9, written in an image of a handwritten
number, where the dataset is specified as follows:
Images: Size 28 × 28, 8-bit grayscale (256 gray levels)
Number of training patterns: 60,000
Number of verification patterns: 10,000
Teacher data (Label): {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Now, a simple program for the MNIST identification problem is shown below,
where the code is divided into four parts. Note that this program is a modified version
of the Keras sample program (https://keras.io/examples/vision/mnist_convnet/).

#-----(Part 1)-----
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import models

In Part 1, the libraries to be used are imported.

#-----(Part 2)-----
nClass = 10
iShape = (28, 28, 1)
(x_train,y_train),(x_test,y_test)= \
keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
y_train = keras.utils.to_categorical(y_train, nClass)
y_test = keras.utils.to_categorical(y_test, nClass)

In Part 2, data for training and verification are prepared. nClass is the number
of classification categories, set as nClass = 10 since the problem is to identify
handwritten numbers from 0 to 9. iShape means the format of the image. Since
the image size is 28 × 28 and the number of channels is 1 (grayscale), iShape is
set to (28, 28, 1). For RGB color image of the same size, it would be (28, 28, 3).
Keras has a dedicated function to load the MNIST image data, so the data loading
process is completed simply by calling keras.datasets.mnist.load_data().
With this alone, x_train and y_train store the 60,000 training images and their
labels (teacher data), and x_test and y_test the 10,000 verification images and
their labels, respectively.
Since the input image data loaded by keras.datasets.mnist.load_data(),
x_train and x_test, have integer values from 0 to 255 for each pixel, they are
converted to real values from 0.0 to 1.0, and their dimensions are expanded.
The loaded labels (teacher data), y_train and y_test, are converted to one-hot
encoding using keras.utils.to_categorical(), since the original data
are integer values between 0 and 9. As a result, if y_train[i] is 3 before
conversion, it becomes [0,0,0,1,0,0,0,0,0,0] after conversion.

#-----(Part 3)-----
cnn_mnist = models.Sequential()
cnn_mnist.add(layers.Conv2D(32, kernel_size=(3, 3),\
activation="relu",input_shape=iShape))
cnn_mnist.add(layers.MaxPooling2D(pool_size=(2, 2)))
cnn_mnist.add(layers.Conv2D(64, kernel_size=(3, 3), \
activation="relu"))
cnn_mnist.add(layers.MaxPooling2D(pool_size=(2, 2)))
cnn_mnist.add(layers.Flatten())
cnn_mnist.add(layers.Dropout(0.5))
cnn_mnist.add(layers.Dense(units=nClass, \
activation="softmax"))
cnn_mnist.summary()
cnn_mnist.compile(loss="categorical_crossentropy",\
optimizer="adam", metrics=["accuracy"])

In Part 3, a convolutional neural network is configured. The network consists
of two convolution layers, each followed by max pooling, a Dropout layer [18] to
improve the generalization capability, and a fully connected layer. For the error
function, categorical_crossentropy is used, which is suitable for
classification problems; the rate of correct answers, accuracy, is also computed.

#-----(Part 4)-----
minibatch_size = 100
epochs = 20
cnn_mnist.fit(x_train, y_train, batch_size=minibatch_size, \
epochs=epochs,\
validation_data=(x_test, y_test))

In Part 4, the training is performed. minibatch_size = 100 sets the size
of the mini-batch to 100, and epochs = 20 sets the number of training epochs to
20. By specifying validation_data=(x_test, y_test), the error and
the percentage of correct answers for the verification patterns are displayed during
training.
When cnn_mnist.summary() is used to display the network structure of the
case shown above, the following results are displayed.

Model: "sequential"
Layer (type) Output Shape Param #
conv2d (Conv2D) (None, 26, 26, 32) 320
max_pooling2d (MaxPooling2D) (None, 13, 13, 32) 0
conv2d_1 (Conv2D) (None, 11, 11, 64) 18496
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64) 0
flatten (Flatten) (None, 1600) 0
dropout (Dropout) (None, 1600) 0
dense (Dense) (None, 10) 16010
Total params: 34,826
Trainable params: 34,826
Non-trainable params: 0

Param # for each layer is calculated as follows:

Param # = a × (b × (c × d) + e) for convolution layer (9.2.49)

Param # = a × (b + e) for dense layer (9.2.50)

where a is the number of output channels, b the number of input channels, c and d
the filter size, respectively, and e corresponds to the bias, which is usually 1.
In the above example, the number of parameters for the first conv2d layer, the
second conv2d layer, and the last dense layer are calculated, respectively, as
follows:

320 = 32 × ((3 × 3) + 1) (9.2.51)

18496 = 64 × (32 × (3 × 3) + 1) (9.2.52)

16010 = 10 × (1600 + 1) (9.2.53)

Thus, about 35,000 tunable parameters are to be learned in the CNN employed
for the above program.
Note that some well-known CNNs that have achieved top results in the ILSVRC
(ImageNet Large Scale Visual Recognition Challenge) [16] are available and easily
tested with Keras. Here, we show examples for VGG16 [17] and ResNet [5]. These
trained models can be loaded by the dedicated loading functions as follows:
model = tf.keras.applications.vgg16.VGG16(weights='imagenet')
model = tf.keras.applications.ResNet50(weights='imagenet')
If weights='imagenet' is specified, the weights that have already been trained
are used, whereas if weights=None is specified, the weights are initialized with
random numbers and must be trained from scratch.
The results of model.summary() for each model are given as follows:

Model: "vgg16"
Layer (type) Output Shape Param #
input_1 (InputLayer) [(None, 224, 224, 3)] 0
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
flatten (Flatten) (None, 25088) 0
fc1 (Dense) (None, 4096) 102764544
fc2 (Dense) (None, 4096) 16781312
predictions (Dense) (None, 1000) 4097000
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0

Model: "resnet50"
Layer (type) Output Param # Connected to
Shape
input_2 (InputLayer) [(None, 0
224,
224, 3)
(continued)
376 9 Bases for Computer Programming

(continued)
conv1_pad (ZeroPadding2D) (None, 0 input_2[0][0]
230,
230, 3)
conv1_conv (Conv2D) (None, 9472 conv1_pad[0][0]
112,
112,
64)
conv1_bn (None, 256 conv1_conv[0][0]
(BatchNormalization) 112,
112,
64)
conv1_relu (Activation) (None, 0 conv1_bn[0][0]
112,
112,
64)
pool1_pad (ZeroPadding2D) (None, 0 conv1_relu[0][0]
114,
114,
64)
pool1_pool (MaxPooling2D) (None, 0 pool1_pad[0][0]
56, 56,
64)
conv2_block1_1_conv (None, 4160 pool1_pool[0][0]
(Conv2D) 56, 56,
64)
..(many lines are (None, 1050624
omitted)...... 7, 7,
conv5_block3_3_conv 2048)
(Conv2D)
conv5_block3_2_relu[0][0]
conv5_block3_3_bn (None, 8192
(BatchNormali 7, 7,
conv5_block3_3_conv[0][0] 2048)
conv5_block3_add (Add) (None, 0 conv5_block2_out[0][0]
conv5_block3_3_bn[0][0] 7, 7,
2048)
conv5_block3_out (None, 0 conv5_block3_add[0][0]
(Activation) 7, 7,
2048)
avg_pool (None, 0 conv5_block3_out[0][0]
(GlobalAveragePooling2 2048)
predictions (Dense) (None, 2049000 avg_pool[0][0]
1000)
Total params: 25,636,712
Trainable params: 25,583,592
Non-trainable params: 53,120
Both models above have tens of millions of parameters to be trained, so a large
amount of computational resources (e.g., GPUs) is needed to train them from
scratch. For this reason, transfer learning is often employed, where only the final
fully connected layer is retrained on the new input data while the parameters of the
convolution layers are kept fixed at their pretrained values.
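As a minimal sketch of this approach (assumptions: 10 target classes, 224 × 224 RGB
input images, and a single new dense layer of 256 units; none of these values come
from the text), the convolutional base of VGG16 can be frozen and only a new head
trained:

import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.vgg16.VGG16(weights='imagenet',
                                         include_top=False,
                                         input_shape=(224, 224, 3))
base.trainable = False                       # keep pretrained convolution layers fixed

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),    # new, randomly initialized layer
    layers.Dense(10, activation='softmax')   # retrained output layer for the new task
])
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])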
As discussed above, it is not so difficult to implement a convolutional
neural network by using a library for deep learning. Once the training data for some
target problem are prepared, it is rather easy to apply deep learning to the problem.
However, even if initial results are obtained, tuning of various training conditions
such as network structure is essential to improve the rate of correct answers, which
is often very time consuming.

9.2.4.2 Sample Code for Loading Training Data for Convolutional Neural Networks

Here is a typical program for using image data prepared by the user as training data.
The procedure for loading the training data is as follows:
(0) Prepare a file data.txt containing a list of image data and labels (teacher
data).
(1) Read image file names and labels (teacher data) from data.txt.
(2) Read image data using the filenames read in (1).
Below is a sample of data.txt.

1 img0001.jpg 5
2 img0002.jpg 2
3 img0003.jpg 7
...
1000 img1000.jpg 3

When using Keras, image data in various formats can be easily loaded with the
keras.preprocessing.image library.
For example, if data.txt and all the image files listed in it are in the
execution directory, the following program can read them. Here, the image size is
assumed to be 64 × 64. After loading the image data, such operations as the conversion of
the teacher data to one-hot encoding and the division of the data into training and
verification data, which are explained in the previous subsubsection, should be
performed (a brief sketch is given after the listing below).

import numpy as np
from tensorflow import keras
from tensorflow.keras.preprocessing.image import load_img, img_to_array
#-------------------
num = 1000
ffname = 'data.txt'
xsize = 64
ysize = 64
#-------------------
ts = []
fname = []
with open(ffname, 'r') as f:
    for line in f:
        elements = line.split(' ')
        fname.append(elements[1])
        ts.append(elements[2].rstrip('\n'))
tsi = [int(s) for s in ts]
Y = np.array(tsi)
X = []
for i in range(num):
    img = img_to_array(load_img(fname[i], target_size=(xsize, ysize)))
    X.append(img)
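
As mentioned above, the loaded data still need to be postprocessed before training;
a minimal sketch continuing from the listing above (assuming, for example, 10 classes
and an 80/20 split between training and verification data) might read:

X = np.array(X) / 255.0                    # scale pixel values to [0.0, 1.0]
Y = keras.utils.to_categorical(Y, 10)      # one-hot encoding of the labels
ntrain = int(0.8 * num)
x_train, x_test = X[:ntrain], X[ntrain:]
y_train, y_test = Y[:ntrain], Y[ntrain:]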

References

1. Cottrell, J.A., Hughes, T.J.R., Bazilevs, Y.: Isogeometric Analysis. Wiley (2009)
2. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic
optimization. J. Mach. Learn. Res. 12, 2121–2159 (2011)
3. Field, D.A.: Qualitative measures for initial meshes. Int. J. Numer. Methods Eng. 47, 887–906
(2000)
4. Harris, C.R., Millman, K.J., van der Walt, S.J., Gommers, R., Virtanen, P., Cournapeau, D.,
Wieser, E., Taylor, J., Berg, S., Smith, N.J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M.H.,
Brett, M., Haldane, A., del Río, J.F., Wiebe, M., Peterson, P., Gérard-Marchant, P., Sheppard,
K., Reddy, T., Weckesser, W., Abbasi, H., Gohlke, C., Oliphant, T.E.: Array programming with
NumPy. Nature 585, 357–362 (2020). https://doi.org/10.1038/s41586-020-2649-2.
5. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. 2016
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016,
pp. 770–778, https://doi.org/10.1109/CVPR.2016.90.
6. Hughes, T.J.R., Cottrell, J.A., Bazilevs, Y.: Isogeometric Analysis: CAD, finite elements,
NURBS, exact geometry, and mesh refinement. Comput. Methods Appl. Mech. Eng. 194,
4135–4195 (2005)
7. Kernighan, B.W., Ritchie, D.M.: The C programming language (Second Edition). Prentice Hall
(1988)
8. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. in the 3rd International
Conference for Learning Representations (ICLR), San Diego, 2015, arXiv:1412.6980
9. Knupp, P.M.: A method for hexahedral mesh shape optimization. Int. J. Numer. Methods Eng.
58, 319–332 (2003)
10. Knupp, P.M.: Algebraic mesh quality metrics for unstructured initial meshes. Finite Elem.
Anal. Des. 39, 217–241 (2003)
11. McKinney, W.: Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
(2nd edition). O’Reilly (2017)
12. Piegl, L., Tiller, W.: The NURBS Book 2nd ed. Springer (2000)
13. Plauger, P.J.: The standard C library. Prentice Hall (1992)
14. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C: The
Art of Scientific Computing (Second Edition). Cambridge University Press (1992). (http://numerical.recipes)
15. Rogers, D.F.: An Introduction to NURBS with Historical Perspective. Academic Press (2001)
16. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A.,
Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large Scale Visual Recogni-
tion Challenge. Int. J. Comput. Vis. 115, 211–252 (2015). https://doi.org/10.1007/s11263-015-
0816-y
17. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition. ICLR 2015, arXiv: 1409.1556, 2015.
18. Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: A
simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958
(2014)
19. Tieleman, T., Hinton, G.: Lecture 6.5-rmsprop: Divide the gradient by a running average of its
recent magnitude. COURSERA: Neural networks for machine learning 4(2), 26–31 (2012)
Chapter 10
Computer Programming for a Representative Problem

Abstract In this chapter, we take up the classification of elements discussed in
Chap. 4 with a real program implementation, where the entire process of an
application of deep learning to computational mechanics is reproduced.

10.1 Problem Definition

The problem to be solved here is the same as the one in Sect. 4.8, i.e.,

Problem Estimate the optimal number of integration points for the elemental
integration of an element from its shape features.

The assumptions of the problem are as follows:


Element type: 8-noded hexahedral elements of the first order
Numerical quadrature method: Gauss–Legendre quadrature
Number of integration points: The same number assumed for each axis
Material: Isotropic elasticity.
As a neural network to solve this problem, a fully connected feedforward neural
network is employed. The input and output data of the neural network are set as
follows:
Input data: Shape features of an element
Output data: Optimal number of integration points per axis.
Following the usual three-step procedure used in various applications in the case
study (Part II), the overall flow of the solution process can be summarized as follows
(Fig. 10.1):


Fig. 10.1 Flowchart of three-step procedure: Data Preparation Phase (generation of elements, calculation of shape features, calculation of optimal number of integration points), Training Phase (preprocessing of input data, preprocessing of teacher data, completion of training patterns, deep learning), and Application Phase (calculation of shape features, preprocessing of input data, inference, postprocessing)

Data Preparation Phase:


Generation of elements (elemgen.c)
Calculation of shape features of elements (elemshape.c)
Calculation of optimal number of integration points (elemconv.c,
elemngp.c).
Training Phase:
Preprocessing of input data (shapeNN.c)
Preprocessing of teacher data (ngpNN.c)
Completion of training patterns (patternNN.c).
Deep learning.
Application Phase:
Preprocessing of input data
Inference of optimal number of integration points (DAneuro.c)
Postprocessing.

The programming language used is C, and the execution environment is assumed to be Linux.

10.2 Data Preparation Phase

First, a number of training patterns are to be generated for deep learning by the
following procedure:
(1) Generate a large number of elements
(2) Calculate some shape features of each element
(3) Calculate the optimal number of integration points for each element.
In this manner, a large number of data pairs (shape features, optimal number of
integration points) are created.

10.2.1 Generation of Elements

Here, we generate elements of various shapes from the standard configuration shown in
Fig. 10.2. A total of 17 nodal coordinates are determined by adding a random number
to the corresponding coordinates of the basic element, except for the fixed nodal
coordinates shown in the figure. Note that we employ a uniform random number in
[cmin, cmax]; if this range is large, the possibility of generating shapes unsuitable
as elements increases.

Fig. 10.2 8-noded hexahedral element


A sample of elemgen.c is shown below.

/* elemgen.c */
#include <stdio.h>
#include <stdlib.h>
#define rnode drand48()*(cmax-cmin)+cmin
int main(void)
{
  int i,j,k,nel;
  long rseed;
  double node0[8][3],node[8][3],cmin,cmax;
  /*---Part 1--------*/
  scanf("%le %le %d %ld",&cmin,&cmax,&nel,&rseed);
  srand48(rseed);
  node[0][0] = 0.0; node[0][1] = 0.0; node[0][2] = 0.0;
  node[1][0] = 1.0; node[1][1] = 0.0; node[1][2] = 0.0;
  node0[2][0] = 1.0; node0[2][1] = 1.0; node0[2][2] = 0.0;
  node0[3][0] = 0.0; node0[3][1] = 1.0; node0[3][2] = 0.0;
  node0[4][0] = 0.0; node0[4][1] = 0.0; node0[4][2] = 1.0;
  node0[5][0] = 1.0; node0[5][1] = 0.0; node0[5][2] = 1.0;
  node0[6][0] = 1.0; node0[6][1] = 1.0; node0[6][2] = 1.0;
  node0[7][0] = 0.0; node0[7][1] = 1.0; node0[7][2] = 1.0;
  /*---Part 2--------*/
  printf("%d\n",nel);
  for(i=0;i<nel;i++){
    for(j=2;j<8;j++){
      for(k=0;k<3;k++) node[j][k] = node0[j][k] + rnode ;
    }
    node[3][2] = 0.0;
    for(j=0;j<8;j++){
      printf("%d %d",i,j);
      for(k=0;k<3;k++) printf(" %e",node[j][k]);
      printf("\n");
    }
  }
  return 0;
}

In Part 1 of elemgen.c, the nodal coordinates of the basic cubic element are
set, and in Part 2 they are modified using random numbers and the results are output.
This program is compiled by
$ cc -o elemgen.exe elemgen.c

And it is executed by
$ echo "-0.1 0.1 1000 12345" | ./elemgen.exe > elem_node.dat

In this case, the nodal coordinates of each element are stored in elem_node.dat.
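For reference, the layout of elem_node.dat written by the printf() calls above is as
follows: the first line is the number of elements nel, and each subsequent line contains
the element index, the node index, and the three nodal coordinates in %e format (the
coordinates of nodes 2 to 7 vary randomly from element to element, so the values
below are only placeholders):

1000
0 0 0.000000e+00 0.000000e+00 0.000000e+00
0 1 1.000000e+00 0.000000e+00 0.000000e+00
0 2 ...
...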

10.2.2 Calculation of Shape Parameters

Here, several shape features are calculated for each generated element. Selected
features are
A The maximum and the minimum values of the lengths of edges
B The maximum and the minimum values of the angles between edges
C The maximum and the minimum values of the angles between faces
D AlgebraicShapeMetric.
The details of each feature above are given in Sect. 9.1.2. Features A, B, and C are
calculated using the program ElementShape.c (Sect. 9.1.2), and feature D
using ElementShapeMetric.c (Sect. 9.1.2). A sample program, elemshape.c,
is shown as

/* elemshape.c */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "ElementShapeMetric.c" //Section 9.1.2
#include "ElementShape.c" //Section 9.1.2
int main(void)
{
  int i,j,k,ia,ib,nel,nfpn=3,nnpe=8,elem[8]={0,1,2,3,4,5,6,7};
  double eshape[7],**node;
  scanf("%d",&nel);
  printf("%d\n",nel);
  node = (double **)malloc(nnpe*sizeof(double *));
  for(i=0;i<nnpe;i++) node[i] = (double *)malloc(nfpn*sizeof(double));
  for(i=0;i<nel;i++){
    for(j=0;j<nnpe;j++){
      scanf("%d %d",&ia,&ib);
      for(k=0;k<nfpn;k++) scanf("%le",node[j]+k);
    }
    check_shape(eshape,elem,node,nfpn);
    eshape[6] = shape_metric(elem,node,nnpe,nfpn);
    printf("%d",i);
    for(j=0;j<7;j++) printf(" %e",eshape[j]);
    printf("\n");
  }
  return 0;
}

This program performs a series of operations as many times as the number of
elements; for each element, it reads the nodal coordinates, calculates the shape
features, and outputs them. This program is compiled as follows:
$ cc -o elemshape.exe elemshape.c -lm
And it is executed with elem_node.dat generated in Sect. 10.2.1 as input as follows:
$ ./elemshape.exe < elem_node.dat > elem_shape.dat

In this case, the results are stored in elem_shape.dat.

10.2.3 Calculation of Optimal Numbers of Quadrature Points

Then, by evaluating the convergence of the elemental integration for each generated
element, the optimal number of integration points is determined. The procedure is
as follows:
(1) Read nodal coordinates of an element
(2) Calculate the element stiffness matrix esm0 with the number of integration
points per axis qmax being 30
(3) Set q = 2
(4) Calculate the element stiffness matrix esm with q integration points per axis
(5) Calculate and record the difference between esm0 and esm based on Error,
defined as Eq. (4.3.2) in Sect. 4.3
(6) Set q = q + 1
(7) If q = 30, go to (1) to evaluate the next element; if q < 30, go to (4).
To calculate an element stiffness matrix in (2) and (4), the function esm3D08()
(Sect. 9.1.1) is used. A sample program for this process, elemconv.c, is shown
as

/* elemconv.c */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "esm3D08.c" //Section 9.1.1
/*----------------------------------------*/
void gausslg(
double *weight,
double *pos,
int ngauss)
{
int i,j,k,ia,ib,ic;
double w,a,ww,v,p,q,r,d0,d1,d2,p1,p2,u,df;
ia = ngauss ;
ib = ngauss/2 ;
w = 1.0 * ngauss ;
a = 3.1415926535897932/(w+w) ;
ww = w*(w + 1.0)*0.5 ;
for(k=1;k<=ib;k++){
v = cos(a*(2*k-1)) ;
Loop1: ;
p = 1.0 ;
q=v;
for(j=2;j<=ngauss;j++){
r = ((2*j-1)*v*q - (j-1)*p)/j ;
p=q;
q=r;
}
u = (1.0 - v)*(1.0 + v) ;
d0 = (p - v*q)*w/u ;
d1 = (v*d0 - ww*q)/u ;
d2 = (q*d1/(d0*d0)+1.0)*q/d0 ;
v -= d2 ;
if(fabs(d2) >= 1.0e-16) goto Loop1 ;
df = d2*v/u ;
weight[k-1] = 2.0/(w*d0*(1.0 - df*w)*p*(1.0 - df*2.0)) ;
pos[k-1] = v ;
}
if(ib*2 < ngauss){
d0 = 1.0 ;
for(j=1;j<=ib;j++) d0 = (1.0 + 0.5/j)*d0 ;
weight[ib] = 2.0/(d0*d0) ;
pos[ib] = 0.0 ;
}
for(i=0;i<ib;i++){
weight[ngauss-1-i] = weight[i] ;
pos[ngauss-1-i] = pos[i] ;
pos[i] *= -1.0 ;
}
}
/*---------------------------------------------*/
double set_refdata(
double **esm,
int edim)
{
int i,j ;
double d1 ;
d1 = esm[0][0] ;
for(i=0;i<edim;i++){
for(j=0;j<edim;j++){
if(esm[i][j] > d1) d1 = esm[i][j] ;
}
}
return d1 ;
}
/*---------------------------------------------*/
double check_esm(
double **esm,
double **esm0,
double ref_value,
int edim)
{
int i,j,k ;
double sum ;
for(i=0,sum=0.0;i<edim;i++){
for(j=0;j<edim;j++) sum += fabs(esm[i][j] - esm0[i][j]) ;
}
return sum/ref_value ;
}
/*------------------------------------------------*/
int main()
{
int i,j,k,ia,ib,ig,nel,nfpn=3,elem[8]={0,1,2,3,4,5,6,7};
double **esm,**esm0,*gc,*gw,**node,mate[2]={2.0e11,0.3};
int max_ngp=30,nnpe=8,edim=24;
double ref_value,chk_data;
node = (double **)malloc(nnpe*sizeof(double *));
for(i=0;i<nnpe;i++) node[i] = (double *)malloc(nfpn*sizeof
(double));
esm = (double **)malloc(edim*sizeof(double *));
for(i=0;i<edim;i++)esm[i] = (double *)malloc(edim*sizeof
(double));
esm0 = (double **)malloc(edim*sizeof(double *));
for(i=0;i<edim;i++)esm0[i] = (double *)malloc(edim*sizeof
(double));
gc = (double *)malloc(max_ngp*sizeof(double)) ;
gw = (double *)malloc(max_ngp*sizeof(double)) ;
scanf("%d",&nel);
printf("%d\n",nel);
for(i=0;i<nel;i++){
for(j=0;j<8;j++){
scanf("%d %d",&ia,&ib);
for(k=0;k<nfpn;k++) scanf("%le",node[j]+k);
}
ig = max_ngp ;
gausslg(gw,gc,ig) ;
esm3D08(elem,node,mate,esm0,ig,gc,gw,nfpn);
ref_value = set_refdata(esm0,edim) ;
for(ig=2;ig<max_ngp;ig++){
gausslg(gw,gc,ig) ;
esm3D08(elem,node,mate,esm,ig,gc,gw,nfpn);
chk_data = check_esm(esm,esm0,ref_value,edim) ;
printf("%d %d %e\n",i,ig,chk_data) ;
}
}
}

In elemconv.c above, gausslg() is a function that calculates the coordinates and
weights of the integration points of the Gauss–Legendre quadrature [2]. When the
number of integration points is given to gausslg(), the coordinates and weights
of the integration points are calculated and stored in arrays. (For the derivation
of the parameters of the Gauss–Legendre quadrature, see, e.g., Refs. [1, 2].) The
function set_refdata() calculates the denominator of Eq. (4.3.2) in Sect. 4.3,
and the function check_esm() the Error defined in Eq. (4.3.2).
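In terms of what these two helper functions compute (the notation here is ours and is
intended to correspond to Eq. (4.3.2)), the value recorded for q integration points
per axis is

Error = ( Σ_i Σ_j | K(q)_ij − K(30)_ij | ) / max_ij K(30)_ij

where K(q) is the element stiffness matrix computed with q integration points per
axis, K(30) is the reference matrix computed with 30 points per axis, and the maximum
in the denominator is the value returned by set_refdata().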
This program is compiled as follows:
$ cc -O3 -o elemconv.exe elemconv.c -lm

where the optimization (speedup) flag -O3 is specified because of the high
computational intensity.
With elem_node.dat generated in Sect. 10.2.1 as input, the above is executed by
$ ./elemconv.exe < elem_node.dat > elem_conv.dat

In this case, the results are stored in elem_conv.dat.

Then, from the convergence data of the element matrix calculation recorded in the
elem_conv.dat file, the optimal number of integration points, i.e., the minimum
number of integration points that results in less than the predetermined error
(in this case, 1.0 × 10^-7), is calculated. A sample program for this process,
elemngp.c, is shown as

/* elemngp.c */
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
  int i,j,ia,ib,nel;
  double er[30],th=1.0e-7;
  scanf("%d",&nel);
  printf("%d\n",nel);
  for(i=0;i<nel;i++){
    for(j=2;j<30;j++) scanf("%d %d %le",&ia,&ib,er+j);
    for(j=2;j<30;j++) if(er[j]<th) break;
    printf("%d %d\n",i,j);
  }
  return 0;
}

This program is compiled as follows:
$ cc -o elemngp.exe elemngp.c

And it is executed with elem_conv.dat as input by
$ ./elemngp.exe < elem_conv.dat > elem_ngp.dat

In this case, the results, i.e., the optimal number of integration points for each
element, are stored in elem_ngp.dat.
With the processes above, we have collected the following data for each of the nel elements:
Shape features of the element: elem_shape.dat
Optimal number of integration points of the element: elem_ngp.dat

10.3 Training Phase

Now, a feedforward neural network is trained using the shape features of the elements
collected in Sect. 10.2, elem_shape.dat, as the input data and the optimal
numbers of integration points of the elements, elem_ngp.dat, as the teacher
data. DLneuro.c, described in Sect. 9.2.1, is used to construct the
feedforward neural network for this problem.
As preprocessing of the training patterns for the neural network, the following data
conversion is performed. Since the size and range of the shape features differ from
parameter to parameter, we first transform all the parameters so that each of them
falls within the range of [0.0, 1.0].
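In terms of a formula (the notation is ours), denoting by x_ij the j-th shape feature
of the i-th element, the conversion performed in Part 3 of shapeNN.c below is

x'_ij = ( x_ij − min_i x_ij ) / ( max_i x_ij − min_i x_ij )

so that every feature lies in [0.0, 1.0] over the collected elements.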
Next, for the teacher data (output data), the number of units in the output layer is
set equal to the number of categories to use the one-hot encoding, where only one unit
outputs 1 while the other units output 0. Thus, the procedure to create training patterns
for the feedforward neural network is written as follows:
(1) Conversion of the input data to [0.0, 1.0].
(2) Conversion of the teacher data to the one-hot encoding.
(3) Integration into training patterns (see Sect. 9.2.1 for the training pattern format
for DLneuro.c).
Here, a sample program for the 0–1 conversion, shapeNN.c, is given as follows:

/* shapeNN.c */
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
  int i,j,k,ia,ib,nel;
  double **shape,smin[7],smax[7],swidth[7];
  FILE *fp;
  /*---Part 1-----------*/
  fp = fopen("elem_shape.dat","r");
  fscanf(fp,"%d",&nel);
  shape = (double **)malloc(nel*sizeof(double *));
  for(i=0;i<nel;i++) shape[i] = (double *)malloc(7*sizeof(double));
  for(i=0;i<nel;i++){
    fscanf(fp,"%d",&ia);
    for(j=0;j<7;j++) fscanf(fp,"%le",shape[i]+j);
  }
  fclose(fp);
  /*---Part 2------------*/
  for(i=0;i<7;i++) smax[i] = -1.0e30;
  for(i=0;i<7;i++) smin[i] = 1.0e30;
  for(i=0;i<nel;i++){
    for(j=0;j<7;j++){
      if(shape[i][j] > smax[j]) smax[j] = shape[i][j] ;
      if(shape[i][j] < smin[j]) smin[j] = shape[i][j] ;
    }
  }
  for(j=0;j<7;j++) swidth[j] = smax[j] - smin[j] ;
  /*---Part 3------------*/
  for(i=0;i<nel;i++){
    for(j=0;j<7;j++) shape[i][j] = (shape[i][j] - smin[j])/swidth[j] ;
  }
  /*---Part 4------------*/
  printf("%d\n",nel);
  for(i=0;i<nel;i++){
    printf("%d",i);
    for(j=0;j<7;j++) printf(" %e",shape[i][j]);
    printf("\n");
  }
  for(j=0;j<7;j++) printf("%d %e %e\n",j,smin[j],smax[j]);
  return 0;
}

In the code above, Part 1 is a data reading section, Part 2 calculates the maximum
and minimum values of each parameter, Part 3 converts each parameter to 0–1, and
Part 4 writes the converted data and the maximum and minimum values of each
parameter. Note that the maximum and minimum values of each parameter obtained
here are required later in the Application Phase.
This program is compiled as follows:
$ cc -o shapeNN.exe shapeNN.c

And it is executed as
$ ./shapeNN.exe > elem_shapeNN.dat

In this case, the converted data are stored in elem_shapeNN.dat.

Next, a sample program for one-hot encoding, ngpNN.c, is given as follows:

/* ngpNN.c */
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
  int i,j,k,ia,ib,nel,*ngp,**t,gmin,gmax,gcat;
  FILE *fp;
  /*---Part 1-----------*/
  fp = fopen("elem_ngp.dat","r");
  fscanf(fp,"%d",&nel);
  ngp = (int *)malloc(nel*sizeof(int));
  for(i=0;i<nel;i++) fscanf(fp,"%d %d",&ia,ngp+i);
  fclose(fp);
  /*---Part 2------------*/
  gmin = 1000; gmax = -1000 ;
  for(i=0;i<nel;i++){
    if(ngp[i] > gmax) gmax = ngp[i] ;
    if(ngp[i] < gmin) gmin = ngp[i] ;
  }
  gcat = gmax - gmin + 1 ;
  /*---Part 3------------*/
  t = (int **)malloc(nel*sizeof(int *));
  for(i=0;i<nel;i++) t[i] = (int *)malloc(gcat*sizeof(int));
  for(i=0;i<nel;i++){
    for(j=0;j<gcat;j++) t[i][j] = 0 ;
  }
  for(i=0;i<nel;i++) t[i][ngp[i]-gmin] = 1 ;
  /*---Part 4------------*/
  printf("%d\n",nel);
  for(i=0;i<nel;i++){
    printf("%d",i);
    for(j=0;j<gcat;j++) printf(" %d",t[i][j]);
    printf("\n");
  }
  printf("%d %d\n",gmin,gmax);
  return 0;
}
In the code above, Part 1 is the data loading part, Part 2 calculates the maximum
and minimum optimal number of integration points and determines the number of
categories (number of output units) gcat, Part 3 converts the data to teacher data
by one-hot encoding, and Part 4 writes the converted teacher data.
This program is compiled as follows:
$ cc -o ngpNN.exe ngpNN.c

And it is executed as
$ ./ngpNN.exe > elem_ngpNN.dat

In this case, the converted data are stored in elem_ngpNN.dat.


Now, the preprocessing of the input data and the teacher data has been done. The
training patterns can be completed by integrating these two files, elem_shapeNN.dat
and elem_ngpNN.dat; a sample program to integrate them, patternNN.c, is given
as follows:

/* patternNN.c */
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
  int i,j,k,ia,ib,nel,n_in,n_out;
  float f1,f2,f3;
  FILE *fp1,*fp2;
  scanf("%d %d",&n_in,&n_out);
  /*---Part 1-----------*/
  fp1 = fopen("elem_shapeNN.dat","r");
  fscanf(fp1,"%d",&nel);
  fp2 = fopen("elem_ngpNN.dat","r");
  fscanf(fp2,"%d",&nel);
  /*---Part 2------------*/
  for(i=0;i<nel;i++){
    printf("%d",i);
    fscanf(fp1,"%d",&ia);
    for(j=0;j<n_in;j++){
      fscanf(fp1,"%e",&f1);
      printf(" %e",f1);
    }
    fscanf(fp2,"%d",&ia);
    for(j=0;j<n_out;j++){
      fscanf(fp2,"%d",&ib);
      printf(" %d",ib);
    }
    printf("\n");
  }
  /*---Part 3------------*/
  fclose(fp1);
  fclose(fp2);
  return 0;
}

where Part 1 opens both the input and teacher data files for reading, Part 2 reads and
writes a training pattern one by one, and Part 3 closes the files.
This program is compiled as follows:
$ cc -o patternNN.exe patternNN.c

If the number of input parameters is 7 and that of categories of the optimal number
of integration points is 5, it is executed as
$ echo "7 5" | ./patternNN.exe > patternNN.dat

In this case, the complete training data are stored in patternNN.dat.
So far, the training patterns for the feedforward neural network have been prepared.
Now, let us start deep learning with DLneuro.c discussed in Sect. 9.2.1.
DLneuro.c is compiled as follows:
$ cc -O3 -o DLneuro.exe DLneuro.c -lm

DLneuro.exe is executed and its results are stored in result.dat as follows:
$ echo "1000 800 5 20 3 2 0 5000 100 patternNN.dat 12345 0.1 0.1 0.001" | ./DLneuro.exe > result.dat

In the above example, it is assumed that there are two hidden layers, the number
of units in each hidden layer is 20, and the number of training epochs is 5000. Among
the 1000 training patterns, 800 patterns are used for training and 200 patterns for
verification (training monitoring). (See Sect. 9.2.1 for details of DLneuro.c.)
Note that many trials are needed to find the conditions that give the best results
among various settings of meta-parameters such as the number of hidden layers, that
of units in each hidden layer, and the learning coefficients.
It is also known that the initial settings of the connection weights and biases affect
the results; therefore it is necessary to try multiple random sequences to initialize the
connection weights and biases.
It is concluded that the best feedforward neural network for a problem should
be determined by finding the best combination of network structure and training
conditions from the results of many calculations.

10.4 Application Phase

A simple neural network program, DAneuro.c, is prepared for the Application
Phase; it does not have a training function but uses the connection weights and
biases of the neural network trained in the Training Phase. The program DAneuro.c
is shown as

/* DAneuro.c */
#include "nrutil.c"
#include <math.h>
#define FNAMELENGTH 100
#define NHU_V 1
#define NHU_C 0
#define Mom1 0.1
#define Mom2 0.1
#include "DAcommon.c"
#include "DLebp.c" //Section 9.2.1
/*----------------------------------------*/
int main(void)
{
int i,j,k,i1,j1,rseed,MaxPattern,nIU,nOU,*nHU,nHL,nHU0,
nhflag;
float *zOU,**zIU,**zHU,***w,**bias,**zdHU,*zdOU;
char fname1[FNAMELENGTH],fname2[FNAMELENGTH];
FILE *fp;
/*------------------------------------*/
scanf("%d %d %d %d %d %d %s %s",
&MaxPattern,&nIU,&nHU0,&nOU,&nHL,&nhflag,fname1,fname2);
/*----------------------------------*/
nHU = ivector(0,nHL+1);
if(nhflag == NHU_V){
for(i=1;i<=nHL;i++) scanf("%d",nHU+i);
}else{
for(i=1;i<=nHL;i++) nHU[i] = nHU0 ;
}
nHU[0] = nIU ;
nHU[nHL+1] = nOU ;
/*-----------------------------------------*/
zIU = matrix(0,MaxPattern-1,0,nIU-1) ;
zHU = (float **)malloc((nHL+2)*sizeof(float *));
for(i=0;i<nHL+2;i++) zHU[i] = vector(0,nHU[i]-1);
zdHU = (float **)malloc((nHL+2)*sizeof(float *));
for(i=0;i<nHL+2;i++) zdHU[i] = vector(0,nHU[i]-1);
zOU = vector(0,nOU-1) ;
zdOU = vector(0,nOU-1) ;
w = (float ***)malloc((nHL+1)*sizeof(float **));
for(i=0;i<=nHL;i++) w[i] = matrix(0,nHU[i+1]-1,0,nHU[i]-
1) ;
bias = (float **)malloc((nHL+2)*sizeof(float *));
for(i=0;i<=nHL+1;i++) bias[i] = vector(0,nHU[i]-1) ;
/*------------------------------------*/
read_fileA(fname1,zIU,nIU,MaxPattern);
load_weight(fname2,w,bias,nIU,nHU,nOU,nHL);
/*----------------------------------*/
for(i=0;i<MaxPattern;i++){
propagation(i,zIU,zHU,zdHU,zOU,zdOU,w,bias,nIU,nHU,
nOU,nHL);
printf("%d",i);
for(j=0;j<nOU;j++) printf(" %e",zOU[j]);
printf("\n");
}
return 0 ;
}

In the code above, read_fileA() is a function to read the shape parameters
of a new element, and load_weight() is a function to load the connection weights
and biases of the trained neural network created in the Training Phase. Both functions
are defined in DAcommon.c, given as follows:

/* DAcommon.c */
/*--------------------------------------------*/
void a0f(
float *fv,
float *fvd,
float x)
{
float dd;
dd = (1.0f+(float)tanh(x/2.0f))/2.0f;
*fv = dd;
*fvd = dd*(1.0 - dd) ;
}
/*--------------------------------------------*/
void a1f(
float *fv,
float *fvd,
float x)
{
float dd;
dd = (1.0f+(float)tanh(x/2.0f))/2.0f;
*fv = dd;
*fvd = dd*(1.0 - dd) ;
}
/*-------------------------------------------*/
void read_fileA(
char *name,
float **o,
int nIU,
int npattern)
{
int i,j,k;
FILE *fp;
fp = fopen( name, "r" ) ;
for(i=0;i<npattern;i++){
fscanf(fp,"%d",&k);
for(j=0;j<nIU;j++) fscanf(fp,"%e",o[i]+j);
}
fclose( fp );
}
/*--------------------------------------*/
void load_weight(
char *fname,
float ***w,
float **bias,
int nIU,
int *nHU,
int nOU,
int nHL)
{
int i,j,k,iL;
FILE *fp;
fp = fopen(fname,"r");
for(iL=0;iL<=nHL;iL++){
for(i=0;i<nHU[iL];i++){
fscanf(fp,"%d",&k);
for(j=0;j<nHU[iL+1];j++) fscanf(fp,"%e",w[iL][j]+i);
}
}
for(iL=1;iL<=nHL+1;iL++){
for(j=0;j<nHU[iL];j++) fscanf(fp,"%e",bias[iL]+j);
}
fclose(fp);
}
/*---------------------------------*/

DAneuro.c is compiled as follows:
$ cc -O3 -o DAneuro.exe DAneuro.c -lm

DAneuro.exe is executed and its results are stored in rngp.dat as follows:
$ echo "10 7 80 5 3 0 NewElem.dat Weights.dat" | ./DAneuro.exe > rngp.dat

This example assumes that the neural network trained in the Training Phase has
three hidden layers and 80 units per hidden layer, that the connection weights and
biases are stored in Weights.dat, that there are ten new elements for which the
optimal number of integration points is to be estimated, and that the shape parameters
of the elements are stored in NewElem.dat.
Note that Weights.dat is a file that contains only the weights and biases from
the results of DLneuro.exe in the Training Phase.
The procedure for estimating the optimal number of integration points for a new
element in the Application Phase is summarized as follows:
(1) Calculate the shape features of the new element for which the optimal number
of integration points is to be estimated.
(2) If the shape features calculated in (1) are within the range of the maximum
and minimum values output by shapeNN.exe (Sect. 10.3), it is judged that
estimation is possible and we proceed to the next step.
(3) For the elements judged to be estimable in (2), convert the shape features
to the range of [0, 1] using the maximum and minimum values output by
shapeNN.exe (Sect. 10.3).
(4) The converted shape features calculated in (3) are input to DAneuro.exe to
estimate the optimal number of integration points.
DAneuro.exe in (4) above has already been detailed in this section. As for
(1), elemshape.c (Sect. 10.2.2) can be used to calculate the shape features. In
(2), if the calculated features of an element are within the range of the maximum
and minimum values output by shapeNN.exe (Sect. 10.3), the element can be
estimated by DAneuro.exe; otherwise, the element is not to be estimated.
In the postprocessing, the estimation of the optimal number of integration points
based on the output of DAneuro.exe is performed using a criterion such as
"the category with the largest output value is equivalent to the optimal number of
integration points."
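A minimal sketch of such a postprocessing step (not one of the programs above; the
function name and arguments are ours) is, in C:

/* Pick the category with the largest output value and convert it back to     */
/* a number of integration points, using the gmin value written by ngpNN.exe. */
int estimate_ngp(float *zOU, int nOU, int gmin)
{
  int j, jmax = 0;
  for (j = 1; j < nOU; j++) {
    if (zOU[j] > zOU[jmax]) jmax = j;
  }
  return gmin + jmax;  /* estimated optimal number of integration points per axis */
}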

References

1. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in C: The Art of Scientific Computing (Second Edition). Cambridge University Press (1992). (http://numerical.recipes)
2. Watanabe, T., Ohuchi, A.: Gauss-Legendre Quadrature Formula of Very High Order. J. Plasma Fusion Res. (Kakuyuugoukenkyuu) 64(5), 397–407 (1990). https://doi.org/10.1585/jspf1958.64.397 (in Japanese)