1 Introduction

Laser-based additive manufacturing (LBAM) produces metal parts from the bottom up through layer-wise cladding, which provides unprecedented possibilities to produce complicated, multi-functional parts for a wide range of engineering applications [1]. Powder bed fusion (PBF) and directed energy deposition (DED) are the two main LBAM methods. PBF selectively fuses preplaced powder layer by layer using the laser [2]. In DED, the material is fed into a melt pool, which fills in the cross-section of the part and builds the part layer by layer. This process creates large thermal gradients, leading to residual stress and plastic deformation [3], which may affect the surface integrity, microstructure, or mechanical properties of the products.

Since the shape and size of the melt pool are very important in determining the microstructure of additive manufacturing (AM) parts [4], melt pool data have been treated as one of the most important signatures for monitoring and predicting defects [5]. Some researchers develop finite element models to simulate the melt pool for parameter selection in the process planning stage [6–9], while others capture melt pool thermal images during manufacturing and correlate the thermal images with defects for in situ process monitoring [10–15].

For physics-based simulation of the melt pool, the finite element model (FEM) has been widely used to model melt pool geometry and thermal distribution. For example, Romano et al. established an FEM of the AM process to simulate melt pool thermal distribution and geometry in the powder bed and compared these characteristics across different powder materials [6]. Song et al. combined FEM and experimental measurements to analyze the impact of scanning velocity and laser power on melt pool size during the selective laser melting (SLM) process [7]. Zhuang et al. developed an FEM to model the change of melt pool dimensions and temperature field of Ti–6Al–4V powder during SLM [8]. Tian et al. established an FEM of fluid flow and heat transfer to study the impact of different process parameters on the dilution ratio of a Ni-based alloy [9]. All these methods, however, are limited to melt pool simulation: they cannot be used for in situ process monitoring or defect prediction because of the randomness in the actual manufacturing process, which is beyond the capability of FEMs.
¹Corresponding author.
Manuscript received June 4, 2020; final manuscript received October 17, 2020; published online December 17, 2020. Assoc. Editor: Qiang Huang.
Journal of Manufacturing Science and Engineering APRIL 2021, Vol. 143 / 041011-1
Copyright © 2020 by ASME

In recent years, the developments in LBAM and sensors have enabled researchers to capture melt pool thermal behavior during the manufacturing process. High-speed thermal images of the melt pool
captured during LBAM make it possible for in situ melt pool monitoring and anomaly detection. Khanzadeh et al. used tensor decomposition to extract features from thermal image streams to monitor metal-based AM [10]. Mahmoudi et al. used melt pool thermal images as the input signal to detect layer-wise defects in PBF processes [11]. Seifi et al. built a novel model to study the relationship between thermal images and defects [12].

A major anomaly or defect with LBAM is porosity. The lack of knowledge about the underlying process–porosity relationship, along with the pressing need for efficient and accurate porosity pre-

never been brought into the picture. Smart data fusion of these two sources of data will allow us to significantly enhance the prediction accuracy of internal porosity.

The remainder of the paper is arranged as follows. Section 2 explains the pyrometer and IR data, including the data collection procedure, pre-processing, porosity labeling, and augmenting the imbalanced dataset. Section 3 presents the proposed decision-level data fusion framework for porosity prediction, including a PyroNet for pyrometer images, an IRNet for IR image sequences, and a fusion method to combine the results from PyroNet and IRNet. The
Fig. 2 First, middle, and last scans on layer 29 for (a)–(c) pyrometer data and (d)–(f) infrared camera data
temporal dependence. This is because the scanning direction of the IR camera is about 45 deg to the melt pool, making IR images within the same layer very different from each other, as shown in Fig. 2. The temporal correlations in IR images are a major source of prediction power for porosity and thus cannot be neglected. Therefore, it is reasonable to treat IR images as image sequences rather than individual images.

An image sequence contains several continuous-time images, just like the frames in a video. Having a long sequence can better preserve the long-term temporal correlations but poses challenges to

Data augmentation is a necessary processing step before the actual model training. This is because the training data size, 700, is small for building a deep learning model. Meanwhile, the imbalance between "good" and "bad" samples, i.e., 645 versus 55, is likely to compromise the learning outcome and prediction accuracy: a model trained on such data would not fully learn the population of "bad" samples and would tend to make false-negative predictions. To resolve these concerns, we augment the "bad" samples in the training data using bootstrapping. Bootstrapping has been used in training data/feature augmentation to improve the
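The bootstrapping-based oversampling described above can be sketched as follows. This is an illustrative NumPy helper, not the paper's exact procedure; the function name, stand-in array shapes, and the target count per class are assumptions:

```python
import numpy as np

def bootstrap_augment(images, labels, target_per_class, rng=None):
    """Oversample each class by bootstrapping (sampling with replacement)
    until every class contains `target_per_class` samples.
    Hypothetical helper for illustration only."""
    rng = np.random.default_rng(rng)
    out_x, out_y = [], []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        # sampling with replacement lets the minority ("bad") class catch up
        picks = rng.choice(idx, size=target_per_class, replace=True)
        out_x.append(images[picks])
        out_y.append(labels[picks])
    return np.concatenate(out_x), np.concatenate(out_y)

# e.g., 645 "good" (0) vs 55 "bad" (1) -> 645 of each after augmentation
x = np.zeros((700, 8, 8))          # stand-in for 700 pyrometer images
y = np.array([0] * 645 + [1] * 55)
xa, ya = bootstrap_augment(x, y, target_per_class=645, rng=0)
```

Because the minority samples are drawn with replacement, the augmented "bad" set contains repeated images; this balances the classes without fabricating new measurements.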
Fig. 3 (a) A pyrometer image with “good” label and (b) a pyrometer image with “bad” label
Fig. 5 VGG16 architecture for the PyroNet, adapted from Ref. [35]
Downloaded from http://asmedigitalcollection.asme.org/manufacturingscience/article-pdf/143/4/041011/6604377/manu_143_4_041011.pdf by Nanyang Technological University user on 03 January 2021
Fig. 6 Hyper-parameters in the PyroNet
where * is the convolutional operator. The activation function of these convolution layers is ReLU [36] (i.e., the rectified linear unit), which removes negative values through the simple function $y = (y_{mnq}) \in \mathbb{R}^{M \times N \times Q}$, $y_{mnq} = \max(0, x'_{mnq})$. A stride of 1 and a pad of 1 are used for all the convolutional layers. The max-pooling layer with a kernel size of 2 then reduces the resolution of $y$ and improves the robustness of the learned features. In max-pooling, the $R \times C \times Q$ output map, $z = (z_{rcq}) \in \mathbb{R}^{R \times C \times Q}$, is obtained by computing the maximum values over non-overlapping regions of the input map $y$ with a 2 × 2 square filter

$$z_{rcq} = \max_{m,n \in \{1,2\}} y_{(2r+m)(2c+n)q}, \quad q = 1, 2, \ldots, Q \tag{3}$$

An illustration of the first max-pooling layer in PyroNet is shown in Fig. 7. Through each block (Conv1_x, Conv2_x, …, Conv5_x), the resolution of the input is halved (224, 112, 56, 28, and 14) due to the max-pooling layers, while the channel dimension doubles until 512 (64, 128, 256, 512, and 512) as the number of filters in the convolutional layers increases.

After passing the 5th block (Conv5_x), the features extracted for $H_i$ form $z_i^{(5)} = [(z^{(5)})_{rcq}]_i \in \mathbb{R}^{7 \times 7 \times 512}$, which are flattened into a vector, $z_i^{(5)} \in \mathbb{R}^{1 \times 25088}$, and then fed to two FC layers (equipped with the ReLU activation function) with 4096 channels each

$$f_{ij}^{(1)} = \max\Big(0,\; b_j^{(1)} + \sum_{k=1}^{25088} z_{ik}^{(5)} \cdot w_{jk}^{(1)}\Big), \quad j = 1, 2, \ldots, 4096 \tag{4}$$

$$f_{ij}^{(2)} = \max\Big(0,\; b_j^{(2)} + \sum_{k=1}^{4096} f_{ik}^{(1)} \cdot w_{jk}^{(2)}\Big), \quad j = 1, 2, \ldots, 4096 \tag{5}$$

where $w_{jk}^{(l)}$ and $b_j^{(l)}$ are the weights and biases of the FC layers, $l = 1, 2$. A 1D feature vector $f_i^{(2)} \in \mathbb{R}^{1 \times 4096}$ is thus obtained. To avoid overfitting, a drop-out layer with a rate of 0.5 is added after each of these two FC layers.
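The layer operations in Eqs. (3)–(7) can be sketched in plain NumPy. This toy example is illustrative only: the shapes are not the PyroNet dimensions, and the max-shift inside the softmax is a standard numerical-stability detail, not something stated in the paper:

```python
import numpy as np

def max_pool_2x2(y):
    """Eq. (3): 2x2 non-overlapping max-pooling over an M x N x Q map."""
    m, n, q = y.shape
    # group each 2x2 spatial patch, then take the patch maximum
    return y.reshape(m // 2, 2, n // 2, 2, q).max(axis=(1, 3))

def fc_relu(z, w, b):
    """Eqs. (4)-(5): FC layer with ReLU, f_j = max(0, b_j + sum_k z_k w_jk)."""
    return np.maximum(0.0, b + z @ w.T)

def two_way_softmax_good_prob(f):
    """Eq. (6): p = e^{f_0} / (e^{f_1} + e^{f_0}).
    Subtracting max(f) before exponentiating keeps exp() stable."""
    e = np.exp(f - np.max(f))
    return e[0] / e.sum()

def cross_entropy(p_true, p_hat):
    """Eq. (7): CE = -p log(p_hat) - (1 - p) log(1 - p_hat)."""
    return -p_true * np.log(p_hat) - (1 - p_true) * np.log(1 - p_hat)

# toy 4x4 single-channel map: each 2x2 patch collapses to its maximum
y = np.arange(16, dtype=float).reshape(4, 4, 1)
z = max_pool_2x2(y)                                  # shape (2, 2, 1)
p = two_way_softmax_good_prob(np.array([2.0, 2.0]))  # equal logits -> 0.5
```

Each call mirrors one equation term by term, so the sketch can be checked against the text: the pooled map halves the spatial resolution, and equal logits in the two-way softmax give a probability of exactly 0.5.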
After that, another FC layer, which uses a two-way softmax function as the activation, establishes the relationship between the extracted features $f_i^{(2)}$ and the binary porosity label (0 for "good" and 1 for "bad") in PyroNet. The probability of a sample being "good," $\hat{p}_{pyro}(i)$, can be calculated by

$$\hat{p}_{pyro}(i) = \frac{e^{f_{i0}}}{e^{f_{i1}} + e^{f_{i0}}}, \quad f_{ij} = b_j + \sum_{k=1}^{4096} f_{ik}^{(2)} \cdot w_{jk}, \quad j = 0, 1 \tag{6}$$

where $\hat{p}_{pyro}(i)$ is the predicted probability of a pyrometer image showing negligible porosity (diameter < 0.05 mm, predicted as "good"), and $w_{jk}$ and $b_j$ are the weights and biases of the final FC layer. The predicted label of the pyrometer image will be "good" if $\hat{p}_{pyro}(i) > 0.5$.

We use the Keras package to build up the PyroNet. The pyrometer images are resized to 224 × 224 pixels (width × height) and then fed into the PyroNet. To train the PyroNet, categorical cross-entropy is used as the loss function

$$CE = -p_i \log \hat{p}_{pyro}(i) - (1 - p_i) \log (1 - \hat{p}_{pyro}(i)) \tag{7}$$

information through time-distributed layers first. After that, a flatten layer, a combination of FC layers, and drop-out layers will convert the parameters into a vector and decrease the number of parameters. Finally, an LSTM layer will model the temporal information, and another FC layer is used to predict the classification label. This FC-LSTM structure is illustrated in Fig. 8.

The major disadvantage of FC-LSTM is the way it handles spatiotemporal data. Since it uses FC layers for the input-to-state and state-to-state transitions, it is unable to encode spatial information [35]. Also, the LSTM layer can only deal with one-dimensional vectors; thus, the information has to be flattened to fit the input of the LSTM layer.

To cope with this disadvantage of FC-LSTM, our IRNet employs the ConvLSTM2D layer in the Keras package to build up the LRCN model, as shown in Fig. 9. The ConvLSTM2D layer is similar to an LSTM layer, but both the recurrent transformations and the input transformations are convolutional. That is to say, it can effectively deal with spatial and temporal information simultaneously. The input of IRNet is IR sequences, each of which contains three resized 224 × 224 pixel (width × height) RGB IR images. The input of IRNet is denoted as $\tilde{H}_i$, where $i$ is the index of the IR sequence, $i = 1, 2, \ldots$. Each of the sequences contains three
Fig. 9 Hyper-parameters in the IRNet
resized 224 × 224 pixel (width × height) RGB IR images, denoted as $\chi_1$, $\chi_2$, and $\chi_3$ for each sequence. The input sequences go through three ConvLSTM2D blocks. Each ConvLSTM2D block contains two ConvLSTM2D layers [37] with the same kernel size (set as 3), pad (set as 1), and return sequence (set as True to feed back all the information in each input sequence), whose equations are as follows:

$$\begin{aligned}
i_t &= \sigma(W_{xi} * \chi_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} * \chi_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh(W_{xc} * \chi_t + W_{hc} * H_{t-1} + b_c) \\
o_t &= \sigma(W_{xo} * \chi_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o) \\
H_t &= o_t \circ \tanh(C_t)
\end{aligned} \tag{8}$$

where $\chi_t$ is each image in input sequence $i$, $t = 1, 2, 3$; $i_t$, $f_t$, and $o_t$ are the input gate, forget gate, and output gate of the $t$th image in the sequence, respectively; $C_t$ is the cell output; and $H_t$ is the hidden state. $\sigma$ is the logistic sigmoid function, "*" denotes the convolution operator, and "∘" means the Hadamard product. The weight tensor subscripts have the obvious meaning; for example, $W_{hi}$ is the hidden–input gate tensor and $W_{xo}$ is the input–output gate tensor, etc. The biases are denoted as $b$ with the corresponding subscripts.

Through these blocks, the resolution of the images in the input sequences is halved twice, from 224 to 56, while the channel dimension is doubled twice, from 64 to 256, since we set a larger filtering parameter for each block. The stride of the second ConvLSTM2D layer in each of the three blocks is set as 2, which differs from that of the first ConvLSTM2D layer in the block. By doing so, these two ConvLSTM2D layers are equivalent to one ConvLSTM2D layer with stride 1 followed by a pooling layer. This arrangement decreases the resolution of the data while offering better performance than using a pooling layer. Also, between the two ConvLSTM2D layers in each block, a Batch Normalization layer is used to bring the mean of the previous layer's activations close to 0 and their standard deviation near 1. Other parameters of the ConvLSTM2D layers use their default values, such as the "tanh" activation, the "hard_sigmoid" recurrent activation, and so on.

The rest of IRNet is quite similar to PyroNet: after going through the ConvLSTM2D blocks, the data are flattened, and a combination of FC layers and drop-out layers decreases the number of parameters and predicts the probability of an IR sequence being "good" or "bad" through another FC layer using two-way softmax as the activation function. Categorical cross-entropy is used as the loss function, SGD with Nesterov momentum [38] is used as the optimizer, and accuracy is used as the performance metric to train the IRNet. Note that the learning rate in SGD is adjusted according to the accuracy on the validation part during model training to avoid local optima. A well-trained IRNet is then used to predict the probability of being "good" for the IR sequences in the test set, $\hat{p}_{IR}(i)$, which is finally used in the data fusion framework to calculate $\hat{p}(i)$ according to Eq. (1).

4 Results and Analysis

Since 6-fold CV is performed to help prevent overfitting, we first present the detailed results and analysis for one of the folds (called Fold 1) in Secs. 4.1 to 4.3. Specifically, we will analyze the performance of PyroNet and compare it with existing studies in Sec. 4.1. The performance of IRNet will be presented in Sec. 4.2. The fused results will be given in Sec. 4.3. Finally, the performance of the other folds (Fold 2 to Fold 6) will be listed in Sec. 4.4.
4.1 PyroNet Results. The PyroNet is trained on a training set including 538 "good" and 538 "bad" pyrometer images to train the model and 107 "good" and 107 "bad" pyrometer images to validate the model. The batch size is 96, and the epoch number is 100. The loss and accuracy during the training process of PyroNet are presented in Fig. 10.

It can be seen from Fig. 10 that as the epoch number grows, both the training loss and validation loss decrease to nearly 0, while both the training accuracy and validation accuracy increase to almost 1, indicating that our PyroNet is well trained by the pyrometer dataset.

After training the PyroNet, we can use it to predict the porosity label for the test set (including 129 "good" and 11 "bad" pyrometer images). The confusion matrix of PyroNet is presented in Table 1.

In this study, we treat "bad" as positive and "good" as negative. Thus, True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) are 10, 0, 129, and 1, respectively. Hence, performance indicators such as accuracy, recall, and precision can be determined by

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100\% \tag{9}$$

$$\text{Recall} = \frac{TP}{TP + FN} \times 100\% \tag{10}$$

$$\text{Precision} = \frac{TP}{TP + FP} \times 100\% \tag{11}$$

Table 1 Confusion matrix of the PyroNet

                    Predicted class
Actual class        Bad       Good
Bad                  10          1
Good                  0        129

Table 2 Performance comparison with existing methods

Method                                                      Accuracy (%)  Recall (%)  Precision (%)  False-positive rate (%)
T² chart with tensor decomposition by Khanzadeh et al. [10]     90.74        95.24        61.86            10.0
Q chart with tensor decomposition by Khanzadeh et al. [10]      88.19        25.40        80.00             1.08
K-Nearest Neighbor (KNN) by Khanzadeh et al. [14]                  NA        98.44           NA             0.19
PyroNet                                                         99.29        90.90       100                0

Note: NA in the table means unavailable.
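Plugging the Table 1 counts into Eqs. (9)–(11) reproduces the PyroNet row of Table 2; this short check uses only the published confusion-matrix values:

```python
# Confusion-matrix counts from Table 1 ("bad" = positive, "good" = negative)
TP, FP, TN, FN = 10, 0, 129, 1

accuracy  = (TP + TN) / (TP + FP + TN + FN) * 100  # Eq. (9)  -> 99.29
recall    = TP / (TP + FN) * 100                   # Eq. (10) -> 90.90
precision = TP / (TP + FP) * 100                   # Eq. (11) -> 100
```

Note that the recall in Table 2 (90.90) is the truncated form of 10/11 × 100 ≈ 90.91; the other two values match exactly.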
Fig. 11 Loss and accuracy during the training process of the IRNet
Table 3 Confusion matrix of the IRNet

Table 4 Estimated probability that the melt pool is "good"
Fig. 12 Loss and accuracy during the training process of the PyroNet for Fold 2 to Fold 6: (a) fold 2, (b) fold 3, (c) fold 4, (d) fold 5, and (e) fold 6
Fig. 13 Loss and accuracy during the training process of the IRNet for Fold 2 to Fold 6: (a) fold 2, (b) fold 3, (c) fold 4, (d) fold 5, and
(e) fold 6
4.2 IRNet Results. The IRNet is trained on a training set including 1076 IR sequences, half "good" and half "bad," to train the model, and 214 IR sequences, half "good" and half "bad," to validate the model. The batch size is 96, and the epoch number is 100. The loss and accuracy during the training process of IRNet are presented in Fig. 11.

From Fig. 11, it is clear that after some fluctuations, the training process finally converges, which means that the IRNet is well trained. Next, the well-trained IRNet is used to predict the porosity label for the IR sequences in the test set, with 129 "good" and 11 "bad." The confusion matrix of IRNet is presented in Table 3.

From Table 3, it can be seen that the overall performance of the IRNet is not as good as that of the PyroNet. Thus, we assign a relatively higher weight to the PyroNet results, believing the PyroNet to be more effective than the IRNet. A factor of w = 0.6 is suggested for the decision-level data fusion for porosity prediction.

4.3 Data Fusion Results and Analysis. By setting the weighting factor as 0.6, we combine the PyroNet predicted probability of a sample being "good," $\hat{p}_{pyro}$, and the IRNet predicted probability, $\hat{p}_{IR}$, to calculate the $\hat{p}$ value for each sample in the test set. Most samples have a fused probability of 0 or 1, indicating high confidence in the prediction. The results of selected samples that have $\hat{p}$ between 0 and 1 are shown in Table 4.

From Table 4, it can be seen that the PyroNet prediction and the IRNet prediction are not consistent for samples 4, 21, 41, 47, 62, 91, and 122. For most of them, such as samples 21, 41, 47, 62, 91, and 122, the PyroNet prediction is accurate compared with the true label. However, sample 4, whose true label is "bad," is misclassified by the PyroNet, with a predicted probability of being "good" of $\hat{p}_{pyro}(4) = 0.74$. On the other hand, the IRNet gives the 4th sample a very low predicted probability of being "good," $\hat{p}_{IR}(4) = 0.04$. This yields a fused predicted probability of $\hat{p}(4) = 0.46$, which is less than 0.5, so the 4th sample is predicted as "bad." The predicted labels of other samples after
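The decision-level fusion rule and the sample-4 arithmetic above can be verified in a few lines; the function name is illustrative, while the rule itself is the weighted combination with w = 0.6 described in the text:

```python
def fuse(p_pyro, p_ir, w=0.6):
    """Decision-level fusion of the two network outputs:
    p_hat = w * p_pyro + (1 - w) * p_ir,
    predicting "bad" when the fused probability of "good" is below 0.5."""
    return w * p_pyro + (1 - w) * p_ir

# Sample 4 from Table 4: PyroNet alone errs (0.74 > 0.5), fusion recovers it
p4 = fuse(0.74, 0.04)       # 0.6 * 0.74 + 0.4 * 0.04 = 0.46
label4 = "good" if p4 > 0.5 else "bad"
```

Because the IRNet's low probability pulls the fused value below the 0.5 threshold, the fused prediction agrees with the true "bad" label even though the PyroNet alone misclassifies this sample.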
[2] Yan, Z., Liu, W., Tang, Z., Liu, X., Zhang, N., Li, M., and Zhang, H., 2018, "Review on Thermal Analysis in Laser-Based Additive Manufacturing," Opt. Laser Technol., 106, pp. 427–441.
[3] Heigel, J. C., Michaleris, P., and Reutzel, E. W., 2015, "Thermo-Mechanical Model Development and Validation of Directed Energy Deposition Additive Manufacturing of Ti–6Al–4V," Addit. Manuf., 5, pp. 9–19.
[4] Guo, Q., Zhao, C., Qu, M., Xiong, L., Escano, L. I., Mohammad, S., Hojjatzadeh, H., Parab, N. D., Fezzaa, K., Everhart, W., Sun, T., and Chen, L., 2019, "In-Situ Characterization and Quantification of Melt Pool Variation Under Constant Input Energy Density in Laser Powder Bed Fusion Additive Manufacturing Process," Addit. Manuf., 28, pp. 600–609.
[5] Jafari-Marandi, R., Khanzadeh, M., Tian, W., Smith, B., and Bian, L., 2019,
[20] Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A.-r., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T., and Kingsbury, B., 2012, "Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups," IEEE Signal Process. Mag., 29(6), pp. 82–97.
[21] Krizhevsky, A., Sutskever, I., and Hinton, G. E., 2017, "ImageNet Classification With Deep Convolutional Neural Networks," Commun. ACM, 60(6), pp. 84–90.
[22] Weimer, D., Scholz-Reiter, B., and Shpitalni, M., 2016, "Design of Deep Convolutional Neural Network Architectures for Automated Feature Extraction in Industrial Inspection," CIRP Ann., 65(1), pp. 417–420.
[23] Wang, P., Yan, R., and Gao, R. X., 2017, "Virtualization and Deep Recognition for System Fault Classification," J. Manuf. Syst., 44(2), pp. 310–316.