
Deep learning approaches for thermographic imaging
Cite as: J. Appl. Phys. 128, 155103 (2020); https://doi.org/10.1063/5.0020404
Submitted: 01 July 2020 . Accepted: 23 September 2020 . Published Online: 16 October 2020

Péter Kovács, Bernhard Lehner, Gregor Thummerer, Günther Mayr, Peter Burgholzer, Mario
Huemer, et al.

COLLECTIONS

Paper published as part of the special topic on Photothermics



© 2020 Author(s).
Journal of Applied Physics · ARTICLE · scitation.org/journal/jap

Submitted: 1 July 2020 · Accepted: 23 September 2020 · Published Online: 16 October 2020 · Publisher error corrected: 20 October 2020

Péter Kovács,1,2,a) Bernhard Lehner,3 Gregor Thummerer,4 Günther Mayr,4 Peter Burgholzer,5 and Mario Huemer1

AFFILIATIONS
1 Institute of Signal Processing, Johannes Kepler University Linz, 4040 Linz, Austria
2 Department of Numerical Analysis, Eötvös Loránd University, 1117 Budapest, Hungary
3 Silicon Austria Labs, 4040 Linz, Austria
4 Josef Ressel Centre for Thermal NDE of Composites, University of Applied Sciences Upper Austria, 4600 Wels, Austria
5 Research Center for Non Destructive Testing, 4040 Linz, Austria

Note: This paper is part of the Special Topic on Photothermics.


a) Author to whom correspondence should be addressed: kovika@inf.elte.hu

ABSTRACT
In this paper, we investigate two deep learning approaches to recovering initial temperature profiles from thermographic images in non-destructive material testing. First, we trained a deep neural network (DNN) in an end-to-end fashion by directly feeding the surface temperature measurements to the DNN. Second, we turned the surface temperature measurements into virtual waves (a recently developed concept in thermography), which we then fed to the DNN. To demonstrate the effectiveness of these methods, we implemented a data generator and created a dataset comprising a total of 100 000 simulated temperature measurement images. With the objective of determining a suitable baseline, we investigated several state-of-the-art model-based reconstruction methods, including Abel transformation, curvelet denoising, and time- and frequency-domain synthetic aperture focusing techniques. Additionally, a physical phantom was created to support evaluation on completely unseen real-world data. The results of several experiments suggest that both the end-to-end and the hybrid approach outperformed the baseline in terms of reconstruction accuracy. The end-to-end approach required the least amount of domain knowledge and was the most computationally efficient one. The hybrid approach required extensive domain knowledge and was more computationally expensive than the end-to-end approach. However, the virtual waves served as meaningful features that convert the complex task of the end-to-end reconstruction into a less demanding undertaking. This in turn yielded better reconstructions with the same number of training samples compared to the end-to-end approach. Additionally, it allowed more compact network architectures and use of prior knowledge, such as sparsity and non-negativity. The proposed method is suitable for non-destructive testing (NDT) in 2D where the amplitudes along the objects are considered to be constant (e.g., for metallic wires). To encourage the development of other deep-learning-based reconstruction techniques, we release both the synthetic and the real-world datasets along with the implementation of the deep learning methods to the research community.

© 2020 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license
(http://creativecommons.org/licenses/by/4.0/). https://doi.org/10.1063/5.0020404

I. INTRODUCTION

Analysis of structural imperfections of materials, spare parts, or components of a system is important in preventing the malfunction of devices. This can be achieved by active thermography, which has, in addition to non-destructive testing, several other applications, such as structural health monitoring,1 material characterization,2,3 and thermal imaging in medicine.4–6,45 In thermographic imaging, the specimen is heated by flashlamps, lasers, etc., and the corresponding temperature evolution is then measured on the surface. The resulting thermal pattern is used to reconstruct the heat distribution inside the material, which provides the main information for defect detection.


In recent decades, thermographic data evaluation was dominated by 1D methods, which become inaccurate, for example, when applied to anisotropic heat flow. For more accurate thermographic data evaluation, multidimensional heat flow must be considered. Some research groups have focused on solving multidimensional inverse heat conduction problems (IHCPs) using various thermal stimulation methods.7,8 The virtual wave concept9 is another approach that considers multidimensional heat flow without directly solving an IHCP. Since it does not add information for detecting defects, for a single one-dimensional (1D) reconstruction, the spatial resolution does not improve compared to direct inversion of the heat diffusion. The advantage of the virtual wave concept in 1D is that more advanced regularization techniques that incorporate a priori information, such as sparsity or positivity, can be utilized10,11 in the reconstruction process. The main benefit in 2D or 3D is that for the calculated virtual waves, any conventional ultrasound reconstruction method, such as the synthetic aperture focusing technique (SAFT),12,13 can be used in a second step to reconstruct the defects.9,14 The signal-to-noise ratio (SNR) is significantly enhanced because the lateral heat flow perpendicular to the surface is also taken into account in the reconstructions. This is similar to averaging numerous 1D measurements when imaging a layered structure, but with the essential advantage that the virtual wave concept can be used for any 2D or 3D structure to be imaged. Here, for the first time, after the virtual waves have been calculated, acoustic reconstruction as the second step is performed by deep learning, which shows additional advantages: artifacts caused by limited view or from discretization are suppressed, which results in more accurate reconstruction.

Thermal reconstruction can be modeled mathematically by the heat diffusion equation. Since heat diffusion is an irreversible process, part of the information is inevitably lost, which implies the ill-posed nature of the problem.15 In this case, maximum likelihood solutions typically provide numerically unstable results, and regularization is needed to avoid over-fitting of the data.16 This can be achieved by direct methods, variational image reconstruction, and deep learning. The first approach is based on the approximate inversion of the problem and is fast because it involves only a few matrix-vector multiplications. However, this comes at the price of reconstruction accuracy, which is low for noisy measurements. The variational approach provides a good compromise between over- and under-fitting of the data. In this case, a priori information, such as smoothness, sparsity, group sparsity, and non-negativity, can be incorporated into the model to penalize unfeasible solutions. Although these algorithms are more robust against noise, the penalty function must satisfy certain conditions, such as convexity, which limits the eligible prior knowledge. Another drawback is that the considerably higher computational complexity usually prevents real-time imaging. Furthermore, in many applications, it is not obvious how a suitable regularization parameter is to be chosen.

Deep learning provides a good alternative to direct and variational methods for two main reasons. First, the main computational load of these algorithms is shifted to the offline training phase. This permits real-time imaging in the testing phase, even with embedded hardware. Second, since a priori information is implicitly encoded during the training phase, it need not be identified and explicitly incorporated into the method. Most successful industrial applications of deep learning have used it in a supervised manner. This approach, however, requires large amounts of labeled data, which is not easy to obtain in thermography, as it requires, for instance, production of physical phantoms with varying material properties including defects with various positions, sizes, and shapes. We tackled this problem by using synthetic data to train a deep neural network (DNN), which was then applied to real measurement data in the testing phase. In this paper, we present two training strategies. The first is carried out in an end-to-end fashion, which means that the surface temperature data [see, e.g., Figs. 1(a) and 1(b)] is fed directly to the network used to predict the initial temperature profile inside the material [Fig. 1(d)]. The second approach uses the virtual wave concept9 as a feature extraction step [Fig. 1(c)] before the neural network is applied. This hybrid solution incorporates domain knowledge via virtual waves and automatically explores the internal structure of the data, such as sparsity.

We demonstrate empirically that this pre-processing method removes irrelevant information from the surface temperature data. Compared to the end-to-end approach, we achieved the same

FIG. 1. Two-stage reconstruction process of thermographic images: (a) and (b) surface temperature measurements of specimens with different thermal diffusivities but the
same defects; (c) corresponding virtual waves; and (d) defect locations.


performance with either fewer training samples or a more compact architecture with fewer trainable parameters. We extended our recent work17 on this topic by investigating the end-to-end approach and various ways of extracting virtual waves and by testing the proposed methods on both synthetic and real-world measurement data. For the latter, we performed a qualitative analysis of the reconstructed temperature profiles and compared them to those of other model-based approaches. Furthermore, we provide a quantitative assessment of the baseline, hybrid, and end-to-end approaches in 2D thermographic reconstruction.

The paper is organized as follows. In Sec. II, we review the two-stage reconstruction process and investigate various model-based methods that can be applied in each step. The data, neural network architectures, and training regime are presented in Secs. III A–III C, respectively. We then describe the end-to-end and the hybrid deep learning approach in Secs. III C 2 and III C 1, where we also investigate the effect of various training data sizes. The results of the experiments on unseen synthetic and real-world data are presented in Secs. IV and V, respectively. This is followed by a discussion of the results and of the validity of the proposed methods in Sec. VI. Finally, Sec. VII concludes the paper with results and future research directions.

II. MODEL-BASED APPROACH USING TWO-STAGE RECONSTRUCTION

Thermal reconstruction can be modeled mathematically by the heat diffusion equation,

$$\left(\nabla^2 - \frac{1}{\alpha}\frac{\partial}{\partial t}\right) T(\mathbf{r}, t) = -\frac{1}{\alpha}\, T_0(\mathbf{r})\,\delta(t), \tag{1}$$

where $\alpha$ stands for the thermal diffusivity, $T$ is the temperature as a function of space $\mathbf{r}$ and time $t$, and $T_0$ denotes the initial temperature profile at $t = 0$, which is given by the temporal Dirac delta function $\delta$ on the right-hand side. Except for some special cases, such as adiabatic boundary conditions (see Sec. III A), there is no exact solution to Eq. (1), and numerical approximations must be applied. Therefore, rather than solving Eq. (1) directly, we consider the problem of undamped acoustic wave propagation described by the wave equation,

$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right) p(\mathbf{r}, t) = -\frac{1}{c^2}\frac{\partial}{\partial t}\, p_0(\mathbf{r})\,\delta(t), \tag{2}$$

where $c$ is the speed of sound, $p$ describes the acoustic pressure in the medium under investigation, and $p_0$ is the initial pressure distribution at $t = 0$ just after the Dirac-like excitation impulse. According to Burgholzer et al.,9 the thermographic reconstruction in Eq. (1) can be converted to an ultrasound reconstruction problem by substituting $p_0 \propto T_0$ in Eq. (2). The solution to this equation, $p \propto T_{\mathrm{virt}}$, is called the virtual wave, and its relation to the original temperature distribution $T$ can be formulated as a Fredholm integral equation of the first kind,

$$\int_{-\infty}^{\infty} K(t, t')\, T_{\mathrm{virt}}(\mathbf{r}, t')\, dt' = T(\mathbf{r}, t), \tag{3}$$

where the local transformation kernel $K$ can be written as

$$K(t, t') = \frac{c}{\sqrt{\pi \alpha t}} \exp\left(-\frac{c^2 t'^2}{4 \alpha t}\right) \quad (t > 0). \tag{4}$$

A two-stage reconstruction process can now be defined in which the first step is to solve Eq. (3), followed by an ultrasonic evaluation method for reconstructing $T_0$ from $T_{\mathrm{virt}}$ in the second step.

In this paper, we consider 2D reconstruction problems (see, e.g., Fig. 1), where the temperature evolution $T(\mathbf{r}, t) = T(y, z, t)$ in time $t$ is measured on the surface $z = 0$ by an infrared camera at different locations along the $y$ axis. The corresponding measurement data are an image in 2D that is stored in a matrix $\mathbf{T} \in \mathbb{R}^{N_t \times N_y}$ for estimating the initial temperature profile $\mathbf{T}_0 \in \mathbb{R}^{N_z \times N_y}$, where $N_t$, $N_y$, and $N_z$ denote the number of discretization points along the temporal, lateral, and depth dimensions, respectively. The discrete analog of the two-stage reconstruction process can be defined by the following regularized linear inverse problems:

$$\tilde{\mathbf{v}} = \arg\min_{\mathbf{v}} \left\{ \|\mathbf{d} - \mathbf{K}\mathbf{v}\|_2^2 + \lambda^2 \cdot \Omega(\mathbf{v}) \right\}, \tag{5}$$

$$\tilde{\mathbf{u}} = \arg\min_{\mathbf{u}} \left\{ \|\tilde{\mathbf{v}} - \mathbf{M}\mathbf{u}\|_2^2 + \mu^2 \cdot \Omega(\mathbf{u}) \right\}, \tag{6}$$

where $\Omega(\cdot)$ stands for the penalty function and $\lambda, \mu \ge 0$ are regularization parameters. Here, the bold-face notation indicates that we are working in the dimension of vectorized 2D measurements, that is, $\mathbf{d} = \mathrm{vec}(\mathbf{T}) \in \mathbb{R}^{N_t N_y \times 1}$ and $\mathbf{K} \in \mathbb{R}^{N_t N_y \times N_t N_y}$. $\mathbf{K} = \mathrm{diag}(K, K, \ldots, K)$ is a block-diagonal matrix, where the kernel in Eq. (4) is evaluated at discrete time instances, which means that $K_{i,j} = K(t_i, t_j)$ for $i, j = 0, \ldots, N_t - 1$. The virtual wave vector $\tilde{\mathbf{v}} \approx \mathrm{vec}(\mathbf{T}_{\mathrm{virt}})$ represents the approximated solution to Eq. (2), with $p_0 \propto T_0$ and an arbitrarily chosen dimensionless virtual wave speed $c$ that we set to 1. In Eq. (6), $\mathbf{M} \in \mathbb{R}^{N_t N_y \times N_z N_y}$ represents the matrix form of conventional ultrasound reconstruction methods, such as the SAFT techniques, which are used to estimate the vectorized initial temperature profile $\tilde{\mathbf{u}} \approx \mathrm{vec}(\mathbf{T}_0) \in \mathbb{R}^{N_z N_y \times 1}$.

The solutions to Eqs. (5) and (6) depend on many factors, such as the numerical solver, the penalty function, the optimization constraints, and the regularization parameters. In Secs. II A and II B, we investigate these aspects in order to find the most suitable method for the two-stage thermographic reconstruction process in 2D.

A. First stage: Virtual wave reconstruction

In the first stage, the virtual waves $\tilde{\mathbf{v}}$ are to be estimated via Eq. (5), which is derived from the discretization of the Fredholm integral equation in Eq. (3). This leads to a discrete ill-posed inverse problem.18,19 Regularization including side constraints, such as sparsity and non-negativity, is, therefore, inevitable in order to obtain feasible solutions.

Since the virtual waves represent pressure signals, they can have both positive and negative values. We thus need to find an invertible linear transformation that maps $\tilde{\mathbf{v}}$ onto a space where non-negativity constraints apply. According to Thummerer et al.,10 this can be done by spherical projections that correspond to time integration in 3D and to Abel transformation in 2D. Let us denote these


transformations by $\mathbf{R}$, and define $\hat{\mathbf{K}} = \mathbf{K}\mathbf{R}^{-1}$. The non-negative and sparse variation of Eq. (5) can then be written in the form,

$$\hat{\mathbf{v}} = \arg\min_{0 \le \mathbf{v}} \left\{ \|\mathbf{d} - \hat{\mathbf{K}}\mathbf{v}\|_2^2 + \lambda^2 \cdot \|\mathbf{v}\|_1 \right\}. \tag{7}$$

Once the minimizer $\hat{\mathbf{v}}$ has been found, the original virtual waves $\mathrm{vec}(\mathbf{T}_{\mathrm{virt}})$ can be approximated by $\mathbf{R}^{-1}\hat{\mathbf{v}}$.

Another way to sparsify the virtual wave vector is to use multiscale and multidirectional representations, such as curvelets. Unlike wavelets, this transformation applies a polar tiling pattern in the frequency domain, where each wedge represents the frequency support of the corresponding curvelet element. In the spatial domain, the basis functions look like elongated ridges, which are smooth in one direction and oscillatory in the other. Due to their similar structure, acoustic waves become sparse in the curvelet domain, which is illustrated in Fig. 2.

FIG. 2. Sparse curvelet approximation of virtual waves. (a) Simulated virtual waves with additive white Gaussian noise; (b) virtual waves approximated by keeping less than 1% of the curvelet coefficients.

Motivated by their usefulness in seismology,20 we utilize curvelets to recover the virtual wave field as follows:

$$\tilde{\mathbf{v}} = \arg\min_{\mathbf{v}} \left\{ \|\mathbf{d} - \mathbf{K}\mathbf{v}\|_2^2 + \lambda^2 \cdot \|\mathbf{C}\mathbf{v}\|_1 \right\}. \tag{8}$$

Here, the curvelet transformation matrix $\mathbf{C}$ is not explicitly formed; instead, we apply the fast discrete curvelet transformation (FDCT),21 which requires $O(n^2 \log n)$ flops for images of size $n \times n$. Therefore, this approach is also suitable for large-scale problems if the virtual waves are reconstructed from high-resolution thermal images.

We analyzed the performance of the virtual wave reconstruction methods utilizing Abel and curvelet transformation. To this end, we used a test set from our previous work,17 which consists of 1000 samples in ten versions for various SNRs, thus comprising 10 000 samples in total. These data simulate thermally isolated specimens in 2D and can be modeled by assuming adiabatic boundary conditions in Eq. (1). Both the virtual waves $\mathbf{T}_{\mathrm{virt}}$ and the initial temperature distribution $\mathbf{T}_0$ can thus be calculated analytically. For instance, the thermal and virtual wave images in Fig. 1 were simulated this way. Based on recent results,10 we used the alternating direction method of multipliers (ADMM)22 in conjunction with the L-curve23 method to solve the corresponding sparse approximation problems and to estimate $\lambda$ in Eqs. (7) and (8), respectively. The reconstruction error between the ground truth $\mathbf{T}_{\mathrm{virt}}$ and the reconstructed virtual waves [i.e., $\tilde{\mathbf{T}}_{\mathrm{virt}} = \mathrm{vec}(\hat{\mathbf{v}})$ for Abel trf. or $\tilde{\mathbf{T}}_{\mathrm{virt}} = \mathrm{vec}(\tilde{\mathbf{v}})$ for curvelet trf.] was measured in terms of the mean squared error (MSE),

$$\mathrm{MSE} = \frac{1}{N_t N_y} \sum_{i=1}^{N_t} \sum_{j=1}^{N_y} \left( T_{\mathrm{virt}_{i,j}} - \tilde{T}_{\mathrm{virt}_{i,j}} \right)^2. \tag{9}$$

Figure 3(a) shows the MSEs averaged over 1000 sample images for each method and SNR. Except in the worst-case scenario with −20 dB SNR, the sparse virtual wave reconstruction with Abel transformation outperformed the curvelet-based approach. Therefore, we conclude that in this case, the non-negativity is a much tighter constraint than the higher level of sparsity assumed in the curvelet domain.

B. Second stage: Initial temperature profile reconstruction

The outcome of the first reconstruction step is an approximation to $\mathbf{T}_{\mathrm{virt}}$, which represents undamped acoustic waves that would be measured on the surface right after temporal excitation. Therefore, we must deal with an ultrasound reconstruction problem in the second stage, which is described by Eq. (6).

In ultrasonography, a point source has a hyperbolic response that depends on both the depth and the speed of sound in the specimen. Therefore, the forward operator $\mathbf{M} \in \mathbb{R}^{N_t N_y \times N_z N_y}$ is designed such that these diffraction hyperbolas are assigned to the corresponding point sources, while the inverse operator $\mathbf{M}^{+}$ collapses these hyperbolas back to point sources. In this work, $\mathbf{M}^{+}$ is provided by the well-known Stolt's f-k migration24 and the T-SAFT25 methods, which allow reconstruction in the frequency and the time domain. The initial temperature profile can then be estimated directly by $\tilde{\mathbf{u}} \approx \mathbf{M}^{+}\tilde{\mathbf{v}}$. We hereafter refer to these algorithms as fkmig and tsaft.

One difficulty in applying variational methods here is the lack of proper inversion for $\mathbf{M}^{+}$. However, it has been shown that the adjoint operator can be used for this purpose.16,26 Therefore, in Eq. (6), we chose $\mathbf{M}$ to be equal to the adjoint of the T-SAFT matrix.25 Recall that the block-diagonal structure of $\mathbf{K}$ allows application of singular value decomposition (SVD) of the kernel $K$ in Eq. (5), which enables estimation of the optimal $\lambda$ via the L-curve method23 in the first stage. However, in the second stage, calculating the SVD is unfeasible because the matrix $\mathbf{M}$ is too large in the case of T-SAFT or it is not explicitly formed, as in the case of f-k migration. Hence, we use the IRfista numerical solver, which has recently been developed for large-scale linear inverse problems,27 where regularization is achieved via semi-convergence of the iterations. Due to physical constraints, we assume that the initial temperature is non-negative (i.e., $\mathbf{u} \ge 0$), and we set $\Omega(\mathbf{u}) = \|\mathbf{u}\|_2$ in Eq. (6). We hereafter refer to this setup as reg tsaft.

Note that nonzero elements of $\mathbf{u}$ represent either noise or defects, but that the latter show up in groups. Assuming that the


FIG. 3. Comparing various model-based approaches for reconstructing (a) virtual waves and (b) initial temperature profiles.

volume of the defects is negligible compared to the full volume of the material, sparsity and group sparsity can also be imposed on $\mathbf{u}$. This naturally raises the question of why not to use a penalty, such as $\Omega(\mathbf{u}) = \|\mathbf{u}\|_1$, that utilizes this knowledge in Eq. (6). The answer is twofold. One, the approximation of the virtual wave vector $\tilde{\mathbf{v}}$ is influenced by both the measurement noise and the regularization error of the first step in Eq. (5). As a consequence, estimating the optimal regularization parameter for each measurement is an intractable task. Two, group sparse optimizers28,29 usually require a priori knowledge of the group boundaries, and their performance is also limited by the number of groups.30 None of these variational approaches provides the necessary level of freedom for detecting defects with arbitrary shapes, locations, and overlap in large-scale linear inverse problems.

Figure 4 illustrates the previously mentioned reconstruction techniques: we first approximated the ground truth virtual waves $\mathbf{T}_{\mathrm{virt}}$ by applying ADMM in conjunction with Abel transformation as described in Sec. II A. This was followed by reconstruction of the initial temperature distribution $\mathbf{T}_0$ in the second stage using tsaft, reg tsaft, and group sparse grp.-tsaft, as shown in Figs. 4(b)–4(d), respectively. Among these methods, the group sparse approximation seemed to be the best, but this approach required more than 150 iterations using the SPGL1 sparse numerical solver.31,32 Note that for this single example, we tried several parameter setups to find the optimal values for the regularization parameter $\lambda$ and for the group sizes. Since this procedure would be intractable for large datasets, we omitted this approach from our extensive comparative study.

We tested and compared the performances of the previously mentioned model-based approaches on the dataset17 described in Sec. II A. Since the Abel-transformed ADMM showed the best performance in the first stage, we used this method to extract the virtual waves. We then applied fkmig, tsaft, and reg tsaft to approximate $\mathbf{T}_0$ in the case of varying noise levels. Figure 3(b) shows that fkmig achieved the lowest reconstruction error in terms of MSE.

FIG. 4. Second stage of the reconstruction process. (a) Virtual wave reconstruction by ADMM with Abel trf.; initial temperature distribution by (b) tsaft, (c) reg tsaft, and (d) group sparse grp. tsaft, where groups of size 10 × 10 were used as indicated by the black grid.
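
To make the discretization behind Eqs. (3)–(5) more concrete, the sketch below assembles a kernel matrix with entries K(t_i, t_j) from Eq. (4) and applies it to a single virtual wave trace. It is an illustration under stated assumptions rather than the authors' implementation: the uniform time grid t_i = i·dt, the zero row for t_0 = 0 (where Eq. (4) is not defined), the quadrature weight dt in `apply_kernel`, and all function and parameter names are choices made here.

```python
import math


def kernel_matrix(n_t, dt, alpha, c=1.0):
    """Return the n_t x n_t matrix with K[i][j] = K(t_i, t_j) from Eq. (4).

    Assumes a uniform grid t_i = i * dt. Eq. (4) only holds for t > 0, so the
    row belonging to t_0 = 0 is left at zero (an assumption of this sketch).
    """
    K = [[0.0] * n_t for _ in range(n_t)]
    for i in range(1, n_t):
        t = i * dt
        scale = c / math.sqrt(math.pi * alpha * t)
        for j in range(n_t):
            t_prime = j * dt
            K[i][j] = scale * math.exp(-(c * t_prime) ** 2 / (4.0 * alpha * t))
    return K


def apply_kernel(K, virtual_wave, dt):
    """Approximate the integral in Eq. (3) for one pixel by a Riemann sum."""
    return [dt * sum(k_ij * v_j for k_ij, v_j in zip(row, virtual_wave))
            for row in K]


# Usage with illustrative values (not the paper's settings): a virtual wave
# consisting of a single pulse is spread out into a broad temperature signal.
K = kernel_matrix(n_t=8, dt=1.0, alpha=0.45, c=1.0)
pulse = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
temperature = apply_kernel(K, pulse, dt=1.0)
```

The smoothing seen in `temperature` is what makes the inversion in Eq. (5) ill-posed and motivates the regularization discussed in Sec. II A.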


Therefore, we chose this model-based reconstruction procedure as the baseline for the proposed deep learning approach.

III. APPROACHES USING DEEP LEARNING

In this section, we describe two approaches to tackling thermal reconstruction that build on the same architectures to allow direct comparison (see Fig. 5). First, we trained deep neural networks in an end-to-end fashion. That is, we directly fed the surface temperature data to the network. Second, we utilized the virtual wave concept9 as a feature extraction step. In this case, we fed the resulting mid-level representation to the neural networks.

A. Data

Deep learning approaches require vast amounts of data to learn the target distribution. This also applies to thermography, where the data depend on many factors, such as thermal diffusivity, parameters of the defects, and measurement setup. As covering all possible variations is impossible, we created the training set by using simulated data only. In this work, we considered Eq. (1) assuming adiabatic boundary conditions, since there is an analytic solution to this particular case,

$$\hat{T}(k_y, k_z, t) = \hat{T}_0(k_y, k_z) \cdot \exp\left(-(k_y^2 + k_z^2) \cdot \alpha t\right), \tag{10}$$

where $\hat{T}$ and $\hat{T}_0$ denote the Fourier cosine transforms of $T$ and $T_0$ in the yz-plane, and $k_y$ and $k_z$ are the corresponding spatial frequencies. In our experiments, the amplitude along the defects was considered to be constant. Therefore, to simulate the surface temperature data, we first generated $T_0$ as a binary image of size $N_z \times N_y$. This was used to calculate $\hat{T}$ by Eq. (10) and then $T$ by taking the inverse Fourier cosine transform of $\hat{T}$. Evaluating $T(y, z, t)$ at $z = 0$ gave the simulated surface temperature data.

Our complete data are based on 10 000 different samples with up to five square-shaped defects with side lengths between two and six pixels.

First, we generated a binary target mask by randomly positioning the defects for each sample. Then, the corresponding surface temperature measurements were simulated by assuming adiabatic boundary conditions (i.e., no heat can flow in or out from the specimen; see, e.g., Sec. III in Ref. 9). The discrete Fourier number $\Delta_{Fo} = \alpha \cdot \Delta_t / \Delta_z^2$ was chosen to be 0.45, where $\Delta_t$ is the temporal resolution and $\Delta_y = \Delta_z$ is the spatial resolution of $y$ and $z$. These surface temperature measurements were later used as input to the end-to-end approach. Note that generating new training data for different thermal diffusivities was not necessary, because we could easily rescale the temporal and spatial resolution to meet the discrete Fourier number of the training data. For our hybrid method, we computed the virtual waves from the temperature measurements as described in Sec. II A (i.e., by using ADMM with Abel transformation10). We highlight that the estimated virtual wave vector $\tilde{\mathbf{v}}$ is independent of the thermal diffusivity $\alpha$ but depends on the dimensionless speed of sound $c$, which was chosen to be 1. The end result for each sample comprised three single-channel images 256 by 64 pixels in size: the temperature measurements, the virtual waves, and the target mask.

Additionally, we used ten different versions of each sample, representing SNRs from −20 dB to 70 dB in 10 dB steps. On the one hand, this is a form of data augmentation that is supposed to increase robustness against changes in the level of SNR during training. On the other hand, having multiple versions of a sample allows more detailed performance evaluation and baseline comparison.

We divided these data into three disjoint (non-overlapping) subsets as follows: our training data consisted of 8000 samples. We normalized the samples to have zero mean and unit standard deviation using only the training data. Given that we had ten different versions of each sample that represented different SNR levels, we ended up with 80 000 samples.

For development and validation purposes, such as architecture engineering, hyperparameter tuning, and model selection, we always used the same 1000 samples. In total, we had 10 000 samples in our validation data.

In order to estimate the generalization capabilities and to ensure a fair comparison to the baseline, we evaluated on an additional dataset. This test dataset was unseen in every regard and also normalized based solely on the training data. It also consisted of 1000 samples in ten versions for different SNRs, thus comprising a total of 10 000 samples.

B. Network architectures

We propose two carefully designed u-net architectures, where we used the implementation from Ref. 33 as a starting point. The first

FIG. 5. Deep learning approaches to thermographic image reconstruction. Rescaling was applied only to the real-world measurement data in order to match the resolution
of the training images.


architecture was developed to be very compact and computationally inexpensive. The resulting model is best suited to embedded and/or real-time settings. The second architecture was developed with the goal of maximizing performance. While the computational cost was ignored in this case, it is still relatively small by modern standards.

The u-net architecture attracted considerable attention by winning the International Symposium on Biomedical Imaging (ISBI) cell tracking challenge in 2015. Initially, it was proposed for various medical image segmentation tasks.34 One major advantage of this architecture is that it can also produce adequate results with moderate amounts of training data. The underlying basic principle is that of a fully convolutional (i.e., without any fully connected layers) auto-encoder with a contracting path (i.e., encoder) and an expansive path (i.e., decoder). In the contracting path, the feature map resolution is successively reduced due to the effect of pooling. In the expansive path, pooling operators are replaced by upsampling operators, which successively increases the resolution.

The ability to localize is realized via skip-connections from the higher-resolution features of the contracting path to the upsampled output of the expansive path. For a more detailed description, we refer to the original paper.34

The compact architecture has a depth of three in both the contracting and the expansive path. It has just 16 filters in the first (single channel) layer, which results in about 109 000 weights. The second, larger architecture is very similar, except that it has an increased depth of five layers, which amounts to about $1.8 \times 10^6$ weights. Further increasing the number of filters and/or the depth of the network did not lead to improved performance in terms of validation loss.

C. Training

To train the weights, we used the same procedure for both models, minimizing the binary cross entropy (BCE) loss as follows. We used the Adam optimizer35 with a learning rate of $1 \times 10^{-3}$ and no weight decay. We trained for 100 epochs; fast convergence rendered a learning-rate schedule unnecessary. Furthermore, we did not apply batch normalization, as several experiments had shown that it was not useful.

FIG. 6. End-to-end approach results with increasing training dataset size for both architectures. cmp train/cmp val: training and validation data loss for the compact model (109 000 weights); lrg train/lrg val: training and validation data loss for the larger model ($1.8 \times 10^6$ weights). Error bars correspond to the standard deviations of the results after training five times.

and standard deviations. Note that we always used the exact same validation data as described in Sec. III A.

The outcome of this experiment is summarized in Fig. 6, which shows a side-by-side comparison of the two architectures. We refer to the compact and to the larger architecture as cmp and lrg, respectively. As can be seen, increasing the number of training samples resulted in models with lower loss and smaller generalization gap (i.e., the difference between training and validation loss). Additionally, the compact model fit the training and validation data less well than the larger model but also produced less overfitting. This is due to the low number of learnable weights, which acts as a regularizer. The compact architecture seems to give consistent results, as the standard deviation of both the training and validation results was relatively small (see cmp train/cmp val). The generalization gap vanished when 40 000 and 80 000 training samples were used.

The results of the larger architecture are similar, except the loss was generally lower than for the compact architecture. Since the generalization gap also remained when training with 80 000 samples, increasing the training set size even further might be useful.

The only data augmentation we applied during training was a simple left-right flip of both the input image and the target mask
with a probability of 50%. Notice that many augmentations known
to be helpful in other image-related tasks are not useful here. For 2. Training with virtual waves
instance, stretching or shearing applied to an input and target
In this section, we investigate the performance of a hybrid
image would not simply result in a consistent input-target pair.
approach. Here, the goal was to learn the reconstruction process
from e v to eu, which also incorporates the previously mentioned a
1. Training end-to-end priori knowledge. The input of the network was provided by the
In this section, we investigate the performance of an first regularization step in Eq. (5). The original end-to-end recon-
end-to-end approach. We fed the temperature measurements struction problem from d to e u was converted to a much easier task,
directly to the neural networks. This approach has the least compu- that is, recognition of hyperbolas in e
v.
tational load, but at the same time the highest task complexity. The setup of the experiment was the same as with the
For additional insights, we trained both architectures with end-to-end approach, except that we used virtual waves instead of
four different amounts of training data, starting with 10 000 temperature measurements as input. Note that, to achieve the most
samples. We then increased the size of the training dataset repeat- meaningful comparison possible, all the samples used for training
edly by a factor of two until we ended up with the full set compris- and validation matched exactly those used in the end-to-end
ing 80 000 samples. In order to increase the meaningfulness of the approach. Again, we repeated each training five times and report
results, we repeated each training five times and report the means the means and standard deviations.
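The aggregation behind Figs. 6 and 7 (mean and standard deviation over five trainings, plus the generalization gap as validation loss minus training loss) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `summarize_runs`, the loss values, and the choice of the sample standard deviation are all hypothetical.

```python
import statistics


def summarize_runs(train_losses, val_losses):
    """Summarize the final losses of several independent trainings.

    train_losses, val_losses: one final loss value per training run
    (hypothetical inputs; the text reports five runs per setting).
    """
    # Generalization gap per run: validation loss minus training loss.
    gaps = [v - t for t, v in zip(train_losses, val_losses)]
    return {
        "train_mean": statistics.mean(train_losses),
        # Sample standard deviation (n - 1); this choice is an assumption.
        "train_std": statistics.stdev(train_losses),
        "val_mean": statistics.mean(val_losses),
        "val_std": statistics.stdev(val_losses),
        "gap_mean": statistics.mean(gaps),
    }


# Made-up loss values for five runs of one model and one training set size.
example_train = [0.10, 0.12, 0.11, 0.09, 0.13]
example_val = [0.12, 0.14, 0.13, 0.11, 0.15]
summary = summarize_runs(example_train, example_val)
```

A gap close to zero corresponds to the situation described for the compact model at 40 000 and 80 000 samples, while a persistent positive gap suggests that more training data might still help.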


FIG. 7. Hybrid approach results with increasing training dataset size for both
architectures. cmp train/cmp val: training and validation data loss for the
compact model (109 000 weights); lrg train/lrg val: training and validation data
loss for the larger model (1.8 × 10⁶ weights).

Figure 7 shows a side-by-side comparison of the two
architectures. In general, the results reflect model behavior similar to
that of the end-to-end approach, except that the loss was
consistently lower. This was expected, as the hybrid approach takes
some of the computational burden away from the neural network
compared to the end-to-end approach. Therefore, virtual waves
seem to be a useful mid-level representation for learning the
reconstruction process.

D. Model selection for baseline comparison

After training, we selected the best models for each
architecture and both the hybrid and the end-to-end approach according
to the validation data results. The resulting four models were then
evaluated on the unseen test data against the baseline.

IV. SIMULATION RESULTS

In this section, we discuss the results of the end-to-end and
the hybrid approaches on the unseen test set and compare them to
the baseline (see Sec. II B).
We start by providing intuitive insights by comparing the
reconstructions of our approaches with the baseline method. For
this, we take one of the hardest examples from the 0 dB SNR
results. In addition to the low SNR, two other points make this
example challenging. First, several defects are in close proximity
and risk becoming merged together. Second, there is a defect
located in between and underneath two other defects, which makes
it difficult to detect.

FIG. 8. A challenging 0 dB SNR example reconstructed by the baseline fkmig
and our proposed methods (MSE in brackets).

Figure 8 shows the target (mask) in the top-left and the
baseline reconstruction fkmig in the top-right subplot. Compared to the
baseline, all four of our proposed approaches produced
reconstructions that were much closer to the desired result, even e2e cmp, the
computationally cheapest one. Furthermore, only the baseline
method seemed to merge several defects together. Detecting the
“shadowed” defect near the lower right corner was difficult for all
methods. In general, only the larger models were able to detect the
defects deeper in the material (the lower, the deeper). Furthermore,
our deep-learning-based methods produced almost no artifacts,
making it easier to find a good threshold for the final binary
defect/no defect decision.
Notice that e2e lrg identifies the “shadowed” defect closer to
the expected area without any false positive artifacts, although the
MSE is larger than that of hybrid lrg. This might seem
counterintuitive at first, but can be attributed to the fact that a visual
(hence subjective) inspection of the results is prone to missing smaller
differences from the target that still contribute to the objective error
metric. For instance, in hybrid lrg of Fig. 8, three out of five defects
have almost perfect reconstruction. In the case of e2e lrg, the same
defects have blurred edges and slightly different amplitudes, which
increase the overall MSE of the reconstructed image.
Depending on the specific application, it might be the case
that either the absence of artifacts, or the correct position, shape, or
size of the defect is the most important criterion for evaluation.
Therefore, a different evaluation metric might be required.
Combined with other techniques, such as uncertainty estimates of
the predictions,36 a practically useful and efficacious system can
then be engineered.
Next, we present the results of a more objective evaluation.
Figure 9 shows the results of the baseline fkmig and of our
proposed methods in terms of the MSE for various SNRs.
Unsurprisingly, for all methods lower SNRs led to worse
reconstructions and thus to higher losses. As can be seen, even the
computationally cheapest method e2e cmp provided a substantial
improvement over the baseline. Overall, the results confirm two
findings from the validation results. First, the larger models hybrid
lrg and e2e lrg performed consistently better in terms of MSE
compared to their compact counterparts hybrid cmp and e2e cmp. Second,
the hybrid models hybrid cmp and hybrid lrg performed consistently
better in terms of MSE compared to their end-to-end counterparts
e2e cmp and e2e lrg.
Interestingly, between 30 dB and 70 dB SNR, both hybrid
models exhibited improving performance, whereas both end-to-end
models stayed approximately at the same level of loss. This is
another indicator that the virtual waves represent the information
of the temperature measurements in a way that is easier to process
by the neural networks. In order to explain this phenomenon in


more detail, we present Fig. 10, depicting measurements (left
column) and their corresponding virtual waves (right column)
from the same sample, but assuming different SNR conditions. The
plots at the bottom represent their corresponding squared
difference. In order to make the squared differences visible to the naked
eye, we had to upscale their magnitude by a factor of 10 000 and 10
for the measurements and the virtual waves, respectively.
Clearly, the virtual waves differ, with the 70 dB representation
looking less blurry than the 30 dB representation, especially toward
the bottom of the image. Additionally, the squared difference shows a
meaningful pattern, information which potentially enables the neural
network to produce ever better results as the SNR conditions improve.
On the other hand, the squared difference of the
measurements does not show such an obvious pattern and is significantly
lower in magnitude by a factor of approximately 1000 (see the MSE
in brackets in Fig. 10). As the input to the end-to-end approach is
extremely similar, the neural network also produces similar output,
which is why the end-to-end results do not improve above 30 dB
SNR anymore. This further demonstrates the effectiveness of the
hybrid approach that incorporates domain knowledge.

FIG. 9. Comparison of baseline fkmig and our proposed methods for various
SNRs.

FIG. 10. Comparison of measurements (left column) and virtual waves (right
column) for different SNRs from the same sample. The squared difference
(upscaled for clarity, true MSE in brackets) between the two virtual waves
seems to contain more meaningful information than the squared difference
between the measurements, allowing the neural network to further improve
above 30 dB SNR when fed with virtual waves.

V. REAL-WORLD EXAMPLE

The experiments described so far were based on synthetic
data only. However, we cannot simply assume that these results
translate to the physical world, especially when we also use synthetic
data for training, as was the case with the deep-learning-based
methods. The generative process for the synthetic data might not
incorporate all relevant aspects, rendering the results less relevant to
practical applications.
Additionally, confounding factors could contribute to the
problem that a seemingly good performance of a machine learning
system might not be the consequence of actually understanding the
underlying concepts that determine the target.37–40 Therefore, we
considered it extremely important to provide additional results
using real-world specimens.
However, for the task at hand, real-world data are hard to
come by mainly for two reasons. First, we are often blind to the
objective ground truth of the real-world specimen, as it would
require destruction of the specimen itself in order to obtain it.
Thus, we do not report the MSE as we did with the synthetic data,
but demonstrate the differences between the methods by showing
their outputs along with an estimate of the ground truth. Second,
creating data in the physical world is more complex than
collecting and annotating samples, for instance, for an image
classification task. Therefore, the following results are based on a single
specimen, and we took measurements from various angles to
increase their meaningfulness.
First, a physical epoxy resin phantom with two embedded
steel rods was created as our real-world specimen. The rods were
heated up by eddy current induction for 2 s using an induction
generator that provided 3 kW power at a frequency of 200 kHz. The
resulting temperature evolution was then measured on the surface
by means of an infrared camera. The thermal diffusivity parameter
of the material was α = 8.14 × 10⁻⁸ m²/s [see Eq. (4)]. Additional
physical parameters of the specimen and the measurement setup
are shown in Figs. 11(a) and 11(b), respectively. Note that we can
simulate internal heat sources within the test sample, since the
epoxy resin has no transparency (i.e., it is opaque) in the infrared
spectral range. Therefore, we can only measure the surface
temperature and reconstruct the internal heat sources from it.
Figure 12(a) shows the measured temperature data at the
sample surface with the highest contrast. It is clearly visible that the
lower regions of the rods cannot be detected due to the low SNR.
Figure 12(b) presents a snapshot of the measurement along the
vertical diameter of the specimen at a spatial resolution of
Δx = 0.04/238 m and a temporal resolution of Δt = 0.04 s. As can
be seen, some vertical and horizontal lines, which are caused by the
pixel crosstalk of the infrared camera,41 distort the original image.
This pattern noise was removed by eliminating the characteristic
frequencies indicated by the red lines in Fig. 12(c). Then, the


filtered measurement data were used to extract the virtual waves
and to reconstruct the initial temperature profile T₀ in each 2D
cross section of the specimen. This was followed by the 3D
reconstruction compositing all 2D estimations. For the end-to-end
approach, the measurement data were rescaled to obtain a discrete
Fourier number of 0.45, which enabled us to apply the generated
training data.

FIG. 11. Real-world experiment. (a) Physical properties of the specimen and (b)
measurement setup.

FIG. 12. Real-measurement data. (a) A frame recorded at the top of the specimen; (b) temperature evolution over time measured along the green dashed line in (a); (c)
log spectra of the temperature data in (b), where the frequencies along the red lines were filtered out.

As in Secs. II–IV, we chose the reconstruction processes that
achieved the lowest MSE, that is, the hybrid and end-to-end
approaches with the large architectures hybrid lrg and e2e lrg, and applied
them to the real-world data. We considered four variations of the
measurement data. In the first case, the cross sections were
perpendicular to the orientation of the steel rods, while we rotated the
specimen by 10°, 25°, and 45° in the other examples.
The results of the model-based fkmig approach can be seen in
Figs. 13–16. We reconstructed the volume of the steel rods (red)
and also show the estimated ground truth (blue) with the
orientation of the cross sections (green). As can be seen, the two steel rods
deeper inside the specimen do not appear as separated defects
anymore. This effect becomes more pronounced on the rotated
examples, and artifacts on the edge of the specimen begin to
appear, which is especially visible in Fig. 16, which shows the results
of rotating the specimen by 45°. Note that the results of the fkmig
approach were so noisy that postprocessing was necessary: We
zeroed out those pixels in the reconstructed image whose values
were below a certain threshold. Tuning this threshold was a difficult
task for the fkmig approach. For the sake of simplicity, we set a
uniform threshold of 200 for all reconstructions in Figs. 13–24.
The end-to-end approach e2e lrg yielded meaningful 3D
reconstructions, as can be seen in Figs. 17–20, where the two
defects stay clearly separated except at the edges of the specimen.
Again, thresholding was necessary to eliminate some artifacts from
the figures, but the end-to-end method turned out to be relatively
insensitive to the value of the threshold. Compared to the
model-based fkmig approach, the results seem to be affected less by
rotating the specimen, and fewer artifacts appear. However, the volume
of the steel rods was underestimated. Interestingly, this approach
also provided good results for the test case with the largest rotation
angle (45°; see Fig. 20).
The results of the hybrid approach can be seen in Figs. 21–24.
As with the end-to-end approach, thresholding was necessary to
eliminate some artifacts from the figures. Overall, the results seem
to be similar to those from the end-to-end approach, and both steel
rods appear as separate defects. However, the volume of the steel
rods seems to be closer to the estimated ground truth. Note that
the results for the specimen with the greatest rotation (45°) show


signs of deterioration, and thresholding did not help to improve
the quality of the reconstruction. We discuss these limitations and
the validity of the proposed deep learning approaches in Sec. VI.

FIG. 13. Using the model-based fkmig approach for 3D reconstruction of the specimen without rotation.

FIG. 14. Using the model-based fkmig approach for 3D reconstruction of the specimen with a rotation of 10°.

VI. DISCUSSION

Both deep-learning-based methods outperformed the
model-based baseline substantially on synthetic data. They also seem to
perform very well on real-world data, even though no real-world
measurements were used in the training process. The networks
successfully generalized the learned reconstruction algorithm to
real-world data even under previously unseen conditions (rotated
specimen). Compared to the model-based fkmig approach, the results
are more consistent with the estimated ground truth, and
thresholding was effortless.
Which method is preferable depends on the particular use
case. If computational complexity must be kept as low as possible,
the end-to-end approach is likely to be the better choice, although
the results might not reflect the actual size and/or shape of a
defect. If the quality of the results is more important than low
complexity, the hybrid approach seems to be the better choice. The
reason for the results of the hybrid approach deteriorating for the
45° specimen is given in Sec. VI A.

A. Validity of the model

Although the first stage of the reconstruction process (i.e., the
virtual wave method) is valid in all dimensions, the second stage is
applicable in 2D only. In fact, the hybrid deep learning approach
was trained on synthetic 2D data assuming adiabatic boundary
conditions. This means that the proposed real-world example
worked well as long as the 2D slices were perpendicular or nearly
perpendicular to the orientation of the steel rods. The deviation from
the perpendicular orientation should not be greater than the angle
between the steel rods and the measurement surface, which was
about 10°. Note that the trained u-net also successfully generalized
the results to rotated specimens. Interestingly, the reconstruction
was very accurate even for a 25° rotation.
Because the thermal reconstruction problem is ill-posed, various
noise assumptions must be fulfilled in order to provide feasible
solutions.42 In the case of NDT applications, the temperature
change is small for short integration time, and, thus, the noise is
considered to be additive white Gaussian (AWGN), which otherwise
follows a Poisson distribution.41 In this work, we assumed AWGN
with a range of variances such that the training set SNRs matched
real-world experiments. In order to check this assumption, we
estimated the SNRs and the peak signal-to-noise-ratios (PSNRs) of each


FIG. 15. Using the model-based fkmig approach for 3D reconstruction of the specimen with a rotation of 25°.

FIG. 16. Using the model-based fkmig approach for 3D reconstruction of the specimen with a rotation of 45°.

FIG. 17. Using the large end-to-end e2e lrg approach for 3D reconstruction of the specimen without rotation.

FIG. 18. Using the large end-to-end e2e lrg approach for 3D reconstruction of the specimen with a rotation of 10°.


FIG. 19. Using the large end-to-end e2e lrg approach for 3D reconstruction of the specimen with a rotation of 25°.

FIG. 20. Using the large end-to-end e2e lrg approach for 3D reconstruction of the specimen with a rotation of 45°.

FIG. 21. Using the large hybrid lrg approach for 3D reconstruction of the specimen without rotation.

FIG. 22. Using the large hybrid lrg approach for 3D reconstruction of the specimen with a rotation of 10°.


FIG. 23. Using the large hybrid lrg approach for 3D reconstruction of the specimen with a rotation of 25°.

FIG. 24. Using the large hybrid lrg approach for 3D reconstruction of the specimen with a rotation of 45°.

cross section d in our test specimen [cf. Fig. 12(b)] as follows:

\[
\mathrm{SNR} = 20 \cdot \log_{10}\!\left(\frac{\|d\|_2}{\sqrt{N}\,\epsilon}\right), \qquad
\mathrm{PSNR} = 20 \cdot \log_{10}\!\left(\frac{\max d}{\epsilon}\right),
\]

where N denotes the overall number of elements in d, and
ϵ = 0.025 K is the noise-equivalent differential temperature
(NEDT) of our infrared camera. Figure 25 shows the estimated SNR
and PSNR values for each vertical cross-section image of the mea-
surement data. The range of SNRs (20 dB to 70 dB) of the training
set covered the estimated SNRs of the measurement data.
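As a rough sketch of how the SNR and PSNR estimates above could be computed for one cross section, the following pure-Python function applies both formulas directly. The function name `snr_psnr_db` and the example values are hypothetical; only the formulas and the NEDT of 0.025 K come from the text.

```python
import math


def snr_psnr_db(d, epsilon=0.025):
    """Estimate SNR and PSNR (in dB) of one cross section.

    d: 2D list of temperature values in kelvin (rows of the cross section).
    epsilon: noise-equivalent differential temperature (NEDT) in kelvin.
    """
    values = [x for row in d for x in row]
    n = len(values)  # N: overall number of elements in d
    l2_norm = math.sqrt(sum(x * x for x in values))  # ||d||_2
    snr = 20.0 * math.log10(l2_norm / (math.sqrt(n) * epsilon))
    psnr = 20.0 * math.log10(max(values) / epsilon)
    return snr, psnr


# Made-up 2 x 2 cross section with a constant temperature rise of 0.25 K.
snr_db, psnr_db = snr_psnr_db([[0.25, 0.25], [0.25, 0.25]])
```

For a constant signal the two estimates coincide, since the root-mean-square value and the maximum are then equal; for real measurement data the PSNR is at least as large as the SNR.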

FIG. 25. Estimated SNR and PSNR values of the measurement frames.

B. Computational complexity
From a computational point of view, training deep neural
networks is an expensive task. However, once training is completed
and the weights are fixed, inference from unseen data is
computationally cheap in comparison. Inspired by Sovrasov’s work,43 we
computed the number of multiply-accumulate operations (MACs)
in our u-net architectures, and the result was that e2e cmp required


0.4 GMAC, while e2e lrg needed 0.76 GMAC to process one input
image of size 256 × 64. In the case of the hybrid and the
model-based approaches, the computational complexity of the virtual wave
extraction, which was 65.5 GMAC, should be added to the overall
number of MACs. Note that the virtual wave extraction works with
raw data, which explains the high computational load compared to
the other three algorithms in Table I. This first stage can be sped
up by processing the measurement frames in parallel.

TABLE I. Computational complexity of the proposed algorithms for processing a
single input image.

Method    cmp         lrg         fkmig       vwave extraction
Input     256 × 64    256 × 64    256 × 64    256 × 2000
GMAC(a)   0.4         0.76        0.0077      65.5

(a) Estimated number of overall giga multiply-accumulate operations.

In order to give an impression of the reconstruction speed, we
measured the execution time on an Intel(R) Core(TM) i9-9900K at
3.60 GHz CPU system equipped with an NVIDIA GeForce GTX TITAN
X GPU. According to our experiments, the 3D reconstruction of the
real-world test specimen took approximately 2 s for e2e lrg, while it was
around 800 s for the large hybrid and the model-based fkmig
approaches including the virtual wave extraction. Note that in our setup,
the difference in inference time between the compact and the large
u-net model was negligible due to our GPU’s ability to parallelize
matrix multiplications. However, this difference will be more pronounced in a
setting where no GPU is available, for instance, on an embedded device.

VII. CONCLUSIONS

We have proposed an end-to-end and a hybrid deep learning
approach for thermographic image reconstruction. The latter uses
the recently developed virtual wave concept,9 which proved to be
an efficient feature extraction technique for our hybrid deep
learning approach.17 We studied each step of the reconstruction chain
by means of quantitative and qualitative tests. In doing so, we
developed a framework44 for generating data samples, which can be
used to train, validate, and test machine learning algorithms for
thermographic imaging. In addition to these synthetic examples,
we made a physical phantom from epoxy and used it for
experiments on real-world data.45
Our experiments showed that, in terms of MSE, the hybrid
method performs better on synthetic data than the model-based
and the end-to-end approaches. Both our deep learning methods
performed well on real-world data compared to their model-based
counterparts. The hybrid approach produced only a few artifacts
and achieved the best reconstruction for test cases in which the 2D
model was valid (i.e., with 0° and 10° of rotation), while the
end-to-end method gave meaningful results on the real-world
measurement data up to the highest rotation angle (45°). Overall, we
conclude that the proposed hybrid method outperforms the
model-based and the end-to-end approaches in terms of reconstruction
error, while the end-to-end approach might perform better in online
applications, where execution speed is crucial. For a first approximation, the
end-to-end approach provides a fast reconstruction, which can be
replaced by the hybrid method if the results are not satisfactory.
Further, we found that the virtual wave concept serves as an
efficient feature extraction technique by which prior knowledge,
such as non-negativity and sparsity, can be incorporated into the
training process to improve the generalization properties of the
neural network solutions and to reduce the size of the training set.
In our simulations, the amplitude along the objects/defects was
considered to be constant, mimicking, for instance, a metallic wire
with constant cross section inside insulating mass. In accordance
with this approach, we designed a real-world experiment that fits
well with the assumptions we made for the training set. In fact, the
amplitude along the steel rods did not change, the infrared camera
we used fulfilled the AWGN assumption,42 and the SNR range of
the measurement was covered by the training set. The applicability
of the proposed deep learning approaches can be extended by
augmenting the training set with additional data and assuming weaker
constraints, such as inhomogeneous defects and non-Gaussian
noise. This will be part of our future work.

AUTHORS’ CONTRIBUTIONS

P. Kovács and B. Lehner contributed equally to this work and
both should be considered first authors of this manuscript.

ACKNOWLEDGMENTS

This work was supported by Silicon Austria Labs (SAL),
owned by the Republic of Austria, the Styrian Business Promotion
Agency (SFG), the federal state of Carinthia, the Upper Austrian
Research (UAR), and the Austrian Association for the Electric and
Electronics Industry (FEEI); and by the COMET-K2 “Center for
Symbiotic Mechatronics” of the Linz Center of Mechatronics
(LCM), funded by the Austrian Federal Government and the
Federal State of Upper Austria.
Financial support by the Austrian Federal Ministry for Digital
and Economic Affairs, the National Foundation for Research,
Technology and Development, and the Christian Doppler Research
Association is gratefully acknowledged. Financial support was also
provided by the Austrian Research Funding Association (FFG)
within the scope of the COMET programme within the research
project “Photonic Sensing for Smarter Processes (PSSP)” (Contract
No. 871974). This programme is promoted by BMK, BMDW, the
federal state of Upper Austria, and the federal state of Styria,
represented by SFG.
Additionally, parts of this work were supported by the
Austrian Science Fund (FWF) (Project Nos. P 30747-N32 and P
33019-N).

DATA AVAILABILITY

The data and code that support the findings of this study are
available at Ref. 46.

REFERENCES

1 C. Antolis and N. Rajic, “Optical lock-in thermography for structural health
monitoring—A study into infrared detector performance,” Procedia Eng. 188,
471–478 (2017).


2 O. Lang, P. Kovács, C. Motz, M. Huemer, T. Berer, and P. Burgholzer, “A linear
state space model for photoacoustic imaging in an acoustic attenuating media,”
Inverse Probl. 35(1), 1–29 (2018).
3 G. Mayr, G. Stockner, H. Plasser, G. Hendorfer, and P. Burgholzer, “Parameter
estimation from pulsed thermography data using the virtual wave concept,”
NDT E Int. 100, 101–107 (2018).
4 K. Sreekumar and A. Mandelis, “Ultra-deep bone diagnostics with fat-skin overlayers
using new pulsed photothermal radar,” Int. J. Thermophys. 34, 1481–1488 (2013).
5 P. T. Tavakolian and A. Mandelis, “Perspective: Principles and specifications of
photothermal imaging methodologies and their applications to non-invasive
biomedical and non-destructive materials imaging,” J. Appl. Phys. 124, 160903
(2018).
6 N. Verdel, J. Tanevski, S. Džeroski, and B. Majaron, “Predictive model for the
quantitative analysis of human skin using photothermal radiometry and diffuse
reflectance spectroscopy,” Biomed. Opt. Express 11, 1679–1696 (2020).
7 A. Mendioroz, K. Martínez, R. Celorrio, and A. Salazar, “Characterizing the
shape and heat production of open vertical cracks in burst vibrothermography
experiments,” NDT E Int. 102, 234–243 (2019).
8 M.-M. Groz, E. Abisset-Chavanne, A. Meziane, A. Sommier, and C. Pradère,
“Three-dimensional reconstruction of thermal volumetric sources from surface
temperature fields measured by infrared thermography,” Appl. Sci. 9, 5464 (2019).
9 P. Burgholzer, M. Thor, J. Gruber, and G. Mayr, “Three-dimensional
thermographic imaging using a virtual wave concept,” J. Appl. Phys. 121, 105102 (2017).
10 G. Thummerer, G. Mayr, M. Haltmeier, and P. Burgholzer, “Photoacoustic
reconstruction from photothermal measurements including prior information,”
Photoacoustics 19, 100175 (2020).
11 G. Thummerer, G. Mayr, P. D. Hirsch, M. Ziegler, and P. Burgholzer,
“Photothermal image reconstruction in opaque media with virtual wave
backpropagation,” NDT E Int. 112, 102239 (2020).
12 L. J. Busse, “Three-dimensional imaging using a frequency-domain synthetic
aperture focusing technique,” IEEE Trans. Ultrason. Ferroelectr. Freq. Control.
39, 174–179 (1992).
13 D. Lévesque, A. Blouin, C. Néron, and J.-P. Monchalin, “Performance of
laser-ultrasonic F-SAFT imaging,” Ultrasonics 40, 1057–1063 (2002).
14 P. Burgholzer, G. Stockner, and G. Mayr, “Acoustic reconstruction for
photothermal imaging,” Bioengineering 5, 1–9 (2018).
15 P. Burgholzer, “Thermodynamic limits of spatial resolution in active
thermography,” Int. J. Thermophys. 36, 2328–2341 (2015).
25 F. Lingvall, T. Olofsson, and S. T., “Synthetic aperture imaging using sources
with finite aperture: Deconvolution of the spatial impulse response,” J. Acoust.
Soc. Am. 114, 225–234 (2003).
26 J. F. Claerbout, Earth Soundings Analysis: Processing versus Inversion
(Blackwell Scientific Publications, Cambridge Center, Cambridge, MA, 2004).
27 S. Gazzola, P. C. Hansen, and J. G. Nagy, “IR tools: A MATLAB package of
iterative regularization methods and large-scale test problems,” Numer.
Algorithms 81, 1–39 (2018).
28 M. Yuan and Y. Lin, “Model selection and estimation in regression with
grouped variables,” J. R. Stat. Soc. Ser. B 68, 49–67 (2006).
29 E. van den Berg and M. P. Friedlander, “Sparse optimization with least-squares
constraints,” SIAM J. Optim. 21, 1201–1229 (2011).
30 Y. Zhang, J. Yang, and W. Yin, “Alternating direction algorithms for l1-problems
in compressive sensing,” SIAM J. Sci. Comput. 33, 1873–1896 (2011).
31 E. van den Berg and M. P. Friedlander, “Probing the Pareto frontier for basis
pursuit solutions,” SIAM J. Sci. Comput. 31, 890–912 (2008).
32 E. van den Berg and M. P. Friedlander, “SPGL1: A solver for large-scale sparse
reconstruction” (2019), see https://friedlander.io/spgl1.
33 J. Jvanvugt, “Tunable u-net implementation in pytorch” (2019), see
https://github.com/jvanvugt/pytorch-unet.
34 O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks
for biomedical image segmentation,” in International Conference on
Medical Image Computing and Computer-Assisted Intervention (Springer, 2015),
pp. 234–241.
35 D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in
Proceedings of the 3rd International Conference for Learning Representations
(ICLR) (OpenReview.net, 2015), pp. 1–15.
36 B. Lehner and T. Gallien, “Uncertainty estimation for non-destructive
detection of material defects with u-nets,” in Proceedings of the 2nd International
Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI)
(IFSA Publishing, 2020).
37 I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing
adversarial examples,” in Proceedings of the 3rd International Conference on Learning
Representations (ICLR), edited by Y. Bengio and Y. LeCun (OpenReview.net, 2015).
38 M. T. Ribeiro, S. Singh, and C. Guestrin, “Why should I trust you?: Explaining
the predictions of any classifier,” in Proceedings of the 22nd ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining (ACM,
2016), pp. 1135–1144.
16
A. Jonas and O. Ozan, “Solving ill-posed inverse problems using iterative deep 39
B. Lehner, J. Schluter, and G. Widmer, “Online, loudness-invariant vocal
neural networks,” Inverse Probl. 33, 124007 (2017). detection in mixed music signals,” IEEE/ACM Trans. Audio Speech Lang.
17
P. Kovács, B. Lehner, G. Thummerer, M. Günther, P. Burgholzer, and Process. 26, 1369–1380 (2018).
M. Huemer, “A hybrid approach for thermographic imaging with deep learn- 40
B. L. Sturm, “A simple method to determine if a music information
ing,” in Proceedings of the 45th IEEE International Conference on Acoustics, retrieval system is a ‘horse’,” IEEE Trans. Multimedia 16, 1636–1644
Speech and Signal Processing (ICASSP) (IEEE, 2020), pp. 4277–4281. (2014).
18
P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Inverse Problems: 41
S. Breitwieser, G. Zauner, and G. Mayr, “Characterization of mid-wavelength
Numerical Aspects of Linear Inversion (SIAM Monographs on Mathematical quantum infrared cameras using the photon transfer technique,” Infrared Phys.
Modeling and Computation, Philadelphia, PA, 1998). Technol. 106, 103283 (2020).
19
C. W. Groetsch, “Integral equations of the first kind, inverse problems and reg- 42
M. N. Özisik and H. Orlande, Inverse Heat Transfer: Fundamentals and
ularization: A crash course,” J. Phys. Conf. Ser. 73, 012001 (2007). Applications (Taylor & Francis, New York, NY, 1999).
20
F. J. Herrmann and G. Hennenfent, “Non-parametric seismic data recovery 43
V. Sovrasov, “Flops counter for convolutional networks in pytorch framework”
with curvelet frames,” Geophys. J. Int. 173, 233–248 (2008). (2020), see https://github.com/sovrasov/flops-counter.pytorch.
21
E. Candés, L. Demanet, D. Donoho, and L. Ying, “Fast discrete curvelet trans- 44
P. Kovács, B. Lehner, G. Thummerer, G. Mayr, P. Burgholzer, and
forms,” Multiscale Model. Simul. 5, 861–899 (2006). M. Huemer, “A hybrid approach for thermographic imaging with deep learning”
22
J. Eckstein and P. D. Bertsekas, “On the Douglas–Rachford splitting method (2019), see https://codeocean.com.
and the proximal point algorithm for maximal monotone operators,” Math. 45
N. Verdel, J. Tanevski, S. Džeroski, and B. Majaron, “A machine-learning
Program. 55, 293–318 (1992). model for quantitative characterization of human skin using photothermal radi-
23
P. C. Hansen, “Analysis of discrete ill-posed problems by means of the ometry and diffuse reflectance spectroscopy,” in Proceedings of SPIE 10851,
L-curve,” SIAM Rev. 34, 561–580 (1992). Photonics in Dermatology and Plastic Surgery (SPIE, 2019).
24 46
D. Garcia, L. L. Tarnec, S. Muth, E. Montagnon, J. Porée, and G. Cloutier, B. Lehner, ThermUnet - Deep Learning Approaches for Thermographic
“Stolt’s f-k migration for plane wave ultrasound imaging,” IEEE Trans. Ultrason. Imaging (Online, 2020); available at https://git.silicon-austria.com/pub/confine/
Ferroelectr. Freq. Control 60, 1853–1867 (2013). ThermUNet
