
Welding in the World (2022) 66:2509–2520

https://doi.org/10.1007/s40194-022-01373-7

RESEARCH PAPER

Weld penetration identification with deep learning method based on auditory spectrum images of arc sounds
Yanfeng Gao1,2   · Qisheng Wang3,4 · Jianhua Xiao5 · Genliang Xiong1,2 · Hua Zhang1,2

Received: 27 April 2022 / Accepted: 26 August 2022 / Published online: 8 September 2022
© International Institute of Welding 2022

Abstract
Penetration states significantly affect the service performance of welded products. To improve welding quality, it is essential to monitor the penetration state of the molten pool in real time during welding. This study uses arc sound signals to identify the penetration states of the weld seam. First, time–frequency spectrum images of arc sounds are obtained with the short-time Fourier transform, and the penetration states are identified from these images with a convolution neural network. To improve the anti-interference ability of the proposed identification method, a mathematical model that simulates the functions of the human auditory system is developed. Auditory spectrum images of arc sounds are acquired with this model, and the penetration states are identified from them. The experimental results show that the proposed method has high anti-interference ability: even when the signal-to-noise ratio is less than 5 dB, the identification accuracy remains above 95%.

Keywords  Weld penetration · Deep learning method · Arc sounds · Auditory spectrum · Convolution neural network

Recommended for publication by Commission XII - Arc Welding Processes and Production Systems

* Yanfeng Gao
  gyf_2672@163.com
* Qisheng Wang
  wangqisheng@mail.ustc.edu.cn

1 School of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
2 Shanghai Collaborative Innovation Center of Intelligent Manufacturing Robot Technology for Large Components, Shanghai 201620, China
3 Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Huihong Building, Changwu Middle Road 801, Changzhou 213164, Jiangsu, China
4 Department of Science Island, University of Science and Technology of China, Anhui 230026 Hefei, China
5 School of Chemistry and Chemical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China

1 Introduction

The penetration state of the weld pool significantly affects the service performance of welded products. During welding, the weld arc generates high-density thermal energy and melts the base metal. The molten metal forms a liquid pool and subsequently solidifies. However, the formation of the weld pool is not a stable process. To obtain a consistent penetration state, the penetration must be monitored in real time. However, the extremely high temperature of the liquid pool makes it almost impossible to measure penetration states online.
Various methods, including vision, infrared imaging, weld pool oscillation, audio sensing, and fusion of the above, have been studied. Xu et al. [1] used a vision technique and a Faster R-CNN model to identify the weld seam type for real-time sensing and tracking of the weld seam. Yang et al. [2] measured the surface temperature of the molten pool with infrared thermography. Huang et al. [3] used laser vision to measure the oscillation frequency of the molten pool and detect the penetration states. Song et al. [4] analyzed audio signals of variable polarity plasma arc welding and identified penetration states from them. Zhang et al. [5] identified penetration states by fusing sound, voltage, and spectrum signals.
Compared with vision and infrared information, welding arc sound signals are relatively easy to collect. Lv et al. [6] proposed an auditory attention method to extract features from arc sound signals and built a back-propagation artificial neural network to identify penetration states. Zhang et al. [7] developed feature selection approaches to select


frequency components of arc sounds and identify the penetration states of aluminum alloy weld beads. Gao et al. [8, 9] identified penetration states by modeling and analyzing human welders' subjective assessments of arc sounds. The results of these studies show that it is feasible to identify the penetration states of a weld bead with arc sound signals. However, arc sounds are easily influenced by the background noise of welding, and the correlation between arc sounds and penetration states is very weak. Therefore, it is crucial to improve the anti-interference capability of penetration identification methods.
For complicated signals or images, deep learning methods have higher recognition accuracy than the human eye. Deep learning has been extensively applied in image recognition, natural language processing, machine fault diagnosis, etc. Recently, some studies have adopted deep learning for weld signal processing and penetration identification. Li et al. [10] proposed a penetration prediction model based on a convolution neural network. This model predicts the penetration states of the weld seam from molten pool images. Ren et al. [11] used a convolution neural network to extract time–frequency image features of arc sound signals for penetration identification. Their experimental results show that the proposed convolution neural network model achieved excellent recognition performance. Wu et al. [12] collected molten pool images and sound signals in plasma welding and used a convolution neural network to extract features and predict weld penetration. These studies mainly focus on the construction and training of the convolution neural network. The original images used as the data set of the neural network come from weld pool photos or arc sound time–frequency spectrum images. In the welding process, there is much background noise, which significantly affects the weld pool photos and arc sound signals. Therefore, it is necessary to first process the original signals or images with a specific method.
Monitoring penetration states with sound signals in noisy industrial environments is extremely difficult. However, daily experience tells us that the human auditory system has excellent anti-interference ability. Even in a workshop environment, a skilled welder can adjust the welding parameters based on arc sounds to maintain weld quality. Many researchers have tried to build mathematical models of the human auditory system to process sound signals. Shamma et al. [13] discovered the inhibitory effect of cochlear neurons and built a lateral inhibition network model. Patterson et al. [14] proposed a Gamma-tone auditory filter model to simulate the cochlear functions. Hewitt et al. [15] studied the functions of hair cells in the cochlea and built a mathematical model to simulate them. The validity of the model was analyzed with physiological experiments. The above research concentrates on the functions of the cochlea. The function of the auditory cortex was studied subsequently by some researchers. Wang et al. [16] developed a spectral shape analysis model based on the neuron physiological mapping in the auditory cortex. Martin et al. [17] proposed a correlation spectrogram model to recognize sound sources. Patterson et al. [18] constructed an auditory model to simulate the function of the auditory cortex. This model can convert sound signals into auditory images. These mathematical models simulate the functions of the human auditory system from different aspects and provide theoretical foundations for practical applications.
In this study, the time–frequency spectrum images of arc sounds are acquired first, and then a convolution neural network is built. Based on the time–frequency spectrum images of arc sounds, the penetration states of the weld seam are identified. To improve the anti-interference performance of the identification method, a model that simulates the function of the human auditory system is designed and the auditory spectrum images of arc sounds are obtained. Finally, based on the auditory spectrum images, the penetration states are identified in a strongly noisy environment.

2 The weld penetration states and experimental system

2.1 The weld bead penetration states

As shown in Fig. 1a, in the welding process the weld arc is formed between the tungsten electrode and the base metal. The heat generated by the weld arc melts the base metal and forms a weld pool. The weld arc consists of a fast-flowing plasma jet. When the plasma jet hits the surface of the weld pool, arc sounds are generated. Different penetration states have specific oscillation frequencies, which affect the characteristics of the arc sounds. Therefore, the penetration states of the weld pool can be detected from arc sound features. Although the penetration of the weld pool is a continuous process that changes with the increasing molten depth of the base metal, the penetration states are usually classified into three categories: non-penetration, full-penetration, and excessive-penetration. The different penetration states of the weld bead are shown in Fig. 1b.
Penetration states significantly affect the weld quality, so detecting the penetration state of the weld pool in real time is crucial. However, the welding process is accompanied by extremely high temperature and strong arc light, which makes real-time detection of the penetration states difficult. Arc sounds contain abundant information about the welding process, so they are potential source signals for penetration detection.


Fig. 1  Welding process and penetration states. a Welding process (labels: tungsten electrode, shield gas, plasma jet, weld arc, weld pool, weld power, base metal). b Weld penetration states (non-penetration, full-penetration, excessive-penetration)

2.2 The experimental system

The experimental system is shown in Fig. 2. The welding power source is a TIG300S DC welder produced by JASCI company. Traditional GTAW welding without wire feed was used in the experiments. The base metal is Q-235 mild steel with a thickness of 4 mm. During welding, the welding gun is static, while the base metal moves at a certain velocity. The sound acquisition system consists of a CRY331 microphone, a CRY506 amplifier, and a CRY575 power source, all produced by CRYSOUND electronics Co. LTD. The microphone is mounted on the same axis as the welding gun, at a distance of 200 mm from the welding arc. A USB3202 data acquisition card is used to collect the arc sound signals. The sampling frequency of the arc sounds is 40 kHz.
To analyze the characteristics of arc sound signals collected under various penetration states, 10 groups of welding experiments were carried out in this study, and each group was repeated 3 times. After welding, the cross-sections of the weld seams were observed and their penetration states were recorded. Table 1 shows the welding parameters of the 10 groups of experiments and the corresponding penetration states of the weld seam.

Fig. 2  Welding experimental system (labels: argon gas cylinder, welding gun, moving platform, welding power source, voice collector, base metal)

3 The time and frequency characteristics of arc sounds

The arc sound signals collected in the three penetration states are shown in Fig. 3. There is no obvious difference between them. Therefore, it is difficult to distinguish the penetration states of the weld seam from the time-domain arc sound signals.
The short-time Fourier transform (STFT) is a traditional method in the digital signal processing field. It is commonly


used for analyzing non-stationary signals. The mathematical expression of the STFT is given in Eq. (1).

$$\mathrm{STFT}(t,f)=\int_{-\infty}^{+\infty}\left[x(\tau)\,g(\tau-t)\right]e^{-j2\pi f\tau}\,d\tau \tag{1}$$

where g(τ − t) is a window function and x(τ) is the original signal. In this study, the Hanning window function is adopted; its expression is given in Eq. (2).

$$g(t)=\begin{cases}\dfrac{1}{T}\left[\dfrac{1}{2}+\dfrac{1}{2}\cos\left(\dfrac{\pi t}{T}\right)\right], & |t|\le T\\ 0, & |t|>T\end{cases} \tag{2}$$

Considering both frequency-domain and time-domain resolution, the frame size of the Hanning window is set to 6.25 ms and the frame shift to 2.5 ms in this study. Figure 4 shows the STFT spectrum of arc sound signals under the three penetration states.
The STFT spectrum image contains the time and frequency information of arc sounds. Figure 4 shows that the STFT spectrum images of arc sounds in the three penetration states are slightly different. Based on these images, the penetration states of the weld seam could be distinguished.

Table 1  Welding parameters
Group  Welding current (A)  Welding speed (mm/min)  Penetration state
1      114                  24.4                    Non-penetration
2      105                  22.0                    Non-penetration
3      100                  18.3                    Non-penetration
4      98                   22.0                    Non-penetration
5      115                  23.2                    Full-penetration
6      115                  23.2                    Full-penetration
7      115                  22.9                    Full-penetration
8      114                  24.4                    Full-penetration
9      110                  20.7                    Excessive-penetration
10     110                  17.1                    Excessive-penetration

Fig. 3  Arc sound signals under various penetration states (arc sound s/V versus time t/s). a Non-penetration. b Full-penetration. c Excessive-penetration

Fig. 4  STFT spectrum of arc sound signals under various penetration states (frequency f/kHz versus time t/s). a Non-penetration. b Full-penetration. c Excessive-penetration
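
The framing described for the STFT can be illustrated with a small sketch. It assumes the 40 kHz sampling rate, the 6.25 ms Hanning frame, and the 2.5 ms frame shift stated in the text, applied to a 100 ms segment; the 4 kHz test tone, the function names, and the naive DFT are illustrative choices and not the authors' implementation. Under these assumptions the magnitude image has 126 frequency bins and 38 frames, which agrees with the 126 × 38 image size quoted for the CNN input.

```python
import cmath
import math

FS = 40_000                      # sampling rate stated in Sect. 2.2 (40 kHz)
FRAME = round(FS * 6.25e-3)      # 250 samples per 6.25 ms frame
HOP = round(FS * 2.5e-3)         # 100 samples per 2.5 ms frame shift


def hann(length):
    """Discrete Hanning window; the 1/T scale factor of Eq. (2) is omitted."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (length - 1))
            for i in range(length)]


def stft_magnitude(signal, frame=FRAME, hop=HOP):
    """Return |STFT| as rows of frequency bins, each holding one value per frame."""
    window = hann(frame)
    n_bins = frame // 2 + 1      # one-sided spectrum of a real-valued signal
    # Precomputed factors e^{-j 2 pi q / frame}; (k * i) % frame picks the right one.
    twiddle = [cmath.exp(-2j * math.pi * q / frame) for q in range(frame)]
    n_frames = (len(signal) - frame) // hop + 1
    image = [[0.0] * n_frames for _ in range(n_bins)]
    for col in range(n_frames):
        start = col * hop
        seg = [signal[start + i] * window[i] for i in range(frame)]
        for k in range(n_bins):
            acc = sum(seg[i] * twiddle[(k * i) % frame] for i in range(frame))
            image[k][col] = abs(acc)
    return image


# Hypothetical 100 ms test tone at 4 kHz standing in for a recorded arc-sound segment.
segment = [math.sin(2 * math.pi * 4000 * i / FS) for i in range(FS // 10)]
image = stft_magnitude(segment)
print(len(image), len(image[0]))   # prints: 126 38
```

The naive DFT keeps the sketch dependency-free; for real recordings an FFT routine from a numerical library would be the usual choice.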


4 Penetration state recognition with deep learning method

4.1 Convolutional neural network model

To distinguish weld seam penetration states from the STFT spectrum images of arc sound signals, a deep learning neural network is constructed; its model is shown in Fig. 5. The basic structure of this CNN comes from VGG16. It consists of two convolution layers and one fully connected network. The fully connected network consists of an input layer, a hidden layer, and an output layer. The input layer has 17856 neurons and the hidden layer has 1024 neurons. The output layer has 3 neurons, each representing one penetration state.
In the training stage, the cross-entropy loss function is adopted; its expression is given in Eq. (3).

$$C=-\frac{1}{N}\sum_{i=1}^{N}\left[y^{*}\ln y+(1-y^{*})\ln(1-y)\right] \tag{3}$$

where y* is the label of the training sample, y is the value predicted by the neural network, and N is the number of training samples. The STFT spectrum images of arc sound signals are taken as inputs of the deep learning neural network, and the image size is 126 × 38 pixels. The labels of non-penetration, full-penetration, and excessive-penetration are denoted as 100, 010, and 001, respectively.

4.2 Recognition results

From the ten groups of welding experiments, 1200 arc sound segments are selected randomly and used as the training and testing data set of the designed CNN. Of the 1200 segments, 400 come from each of the non-penetration, full-penetration, and excessive-penetration states. Each segment is 100 ms long. 70% of the data set is used as the training set, and the remaining 30% as the test set.
To investigate the classification process of the CNN, the arc sound spectrum image features in the input and hidden layers of the fully connected network (Fig. 5) are extracted, and their principal components are analyzed. The contribution rates of the principal components are listed in Table 2.

Table 2  Contribution rates of the principal components
Principal component  1st    2nd    3rd    4th    5th
Input layer          0.670  0.153  0.030  0.015  0.01
Hidden layer         0.801  0.133  0.029  0.013  0.007

Based on Table 2, the first two principal components together contribute more than 80%. Figure 6 shows the scatter diagram of the first two principal components.
It is observed from Fig. 6 that, in the input layer, the image features of the arc sounds of the three penetration states are only roughly separated, while in the hidden layer the three penetration states are separated thoroughly. The identification result of the designed CNN is shown in Table 3. The accuracy rate is more than 99%.

4.3 Anti-interference performance analysis

In the welding site, arc sound signals are easily influenced by background noise. Therefore, the anti-interference performance of the designed method is crucial for its application. To analyze it, white Gaussian noise is added to the original welding arc sound signals. Assume the original

Fig. 5  The model of CNN network (input 126 × 38 → 5 × 5 convolution with ReLU, 32 maps of 126 × 38 → 2 × 2 max pooling, 32 maps of 63 × 19 → 5 × 5 convolution with ReLU, 64 maps of 63 × 19 → 2 × 2 max pooling, 64 maps of 31 × 9 → flatten, 17856 → fully connected, 1024 → softmax output, 3)
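
The layer sizes quoted for the network can be checked with simple shape arithmetic. The sketch below assumes that both 5 × 5 convolutions keep the spatial size (same padding) and that the 2 × 2 max pooling is non-overlapping and drops an odd trailing row or column; these are assumptions chosen because they reproduce the 17856-neuron input of the fully connected network, and the function names are hypothetical.

```python
def conv_same(shape, out_channels):
    """5 x 5 convolution with assumed 'same' padding: height and width are kept."""
    height, width, _ = shape
    return (height, width, out_channels)


def max_pool(shape, size=2):
    """Non-overlapping max pooling; floor division drops an odd trailing row/column."""
    height, width, channels = shape
    return (height // size, width // size, channels)


def trace_shapes(input_shape=(126, 38, 1)):
    """Follow a 126 x 38 single-channel image through the two conv/pool stages."""
    shapes = [input_shape]
    shapes.append(conv_same(shapes[-1], 32))   # first convolution, 32 feature maps
    shapes.append(max_pool(shapes[-1]))
    shapes.append(conv_same(shapes[-1], 64))   # second convolution, 64 feature maps
    shapes.append(max_pool(shapes[-1]))
    height, width, channels = shapes[-1]
    return shapes, height * width * channels   # flattened feature length


shapes, flat = trace_shapes()
for shape in shapes:
    print(shape)
print(flat)   # prints: 17856
```

The flattened length 31 × 9 × 64 = 17856 matches the stated neuron number of the input layer of the fully connected network.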


Fig. 6  Distribution of penetration state features in fully connected network (first principal component versus second principal component; non-penetration, full-penetration, excessive-penetration). a Input layer. b Hidden layer
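
The contribution rates of principal components and the two-component scatter coordinates can be sketched as follows. This is a minimal pure-Python illustration on hypothetical three-dimensional toy data, using power iteration with deflation on the covariance matrix; it is not the authors' procedure, and for feature vectors with thousands of dimensions a numerical library would normally be used instead.

```python
import math
import random


def pca_top2(samples, iters=200, seed=0):
    """Return (contribution_rates, projected) for the first two principal components."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    centred = [[row[j] - mean[j] for j in range(d)] for row in samples]
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    total_var = sum(cov[j][j] for j in range(d))   # trace = sum of all eigenvalues

    rng = random.Random(seed)
    axes, rates = [], []
    for _ in range(2):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):                     # power iteration
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        axes.append(v)
        rates.append(lam / total_var)              # contribution rate of this component
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    projected = [[sum(r[j] * ax[j] for j in range(d)) for ax in axes] for r in centred]
    return rates, projected


# Hypothetical toy features: most variance lies along the first coordinate.
toy_rng = random.Random(1)
toy = [[10 * toy_rng.gauss(0, 1), 3 * toy_rng.gauss(0, 1), 0.5 * toy_rng.gauss(0, 1)]
       for _ in range(200)]
rates, projected = pca_top2(toy)
print([round(r, 3) for r in rates])
```

Each entry of `projected` holds the first and second principal component scores of one sample, which are the two coordinates used in a scatter diagram of this kind.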

arc sound signal is s(n) and the white Gaussian noise is d(n); the noisy sound signal x(n) is then given by Eq. (4).

$$x(n)=s(n)+d(n) \tag{4}$$

Generally, the original arc sound signal s(n) is independent of the noise d(n), i.e., E[s(n)d(n)] = 0. The signal-to-noise ratio (SNR) is adopted to evaluate the intensity of the noise; it is defined in Eq. (5).

$$\mathrm{SNR}=10\log\left[\sum_{n=0}^{N-1}s^{2}(n)\Big/\sum_{n=0}^{N-1}d^{2}(n)\right] \tag{5}$$

The SNR reflects the energy ratio of signal to noise, and its unit is dB. Equation (5) shows that stronger added noise results in a lower SNR. Figure 7 shows the STFT spectrum of arc sound signals when the SNR is 5 dB.
Compared with Fig. 4, the signal content in the high-frequency area increases, which may influence the identification accuracy of the designed method. Figure 8 shows the scatter diagram of the first two principal components of the arc sound STFT spectrum images in the hidden layer. Compared with Fig. 6, many samples of non-penetration and full-penetration are mixed together, so it is difficult to distinguish them.
Figure 9 shows how the identification accuracy rates change as the SNR of the arc sound signals decreases. As the SNR decreases, the identification accuracy rates of non-penetration and excessive-penetration remain almost unchanged, but that of full-penetration decreases significantly. When the SNR is 10 dB, the identification accuracy rate of full-penetration is less than 50%. When the SNR is 5 dB, it is 0%. In the welding process, monitoring the full-penetration state of the weld seam is crucial. Therefore, the anti-interference performance of the designed method should be improved.

Table 3  Penetration state identification results based on arc sound STFT images and CNN network
Penetration state       Non-penetration  Full-penetration  Excessive-penetration  Whole recognition rate
Non-penetration         99.2%            0.8%              0                      -
Full-penetration        0.8%             99.2%             0                      -
Excessive-penetration   0                0                 100%                   -
Whole recognition rate  -                -                 -                      99.5%

5 Penetration recognition with auditory spectrum images of sounds

5.1 The principle of sound signal processing in the auditory system

The structure of the human auditory system is shown in Fig. 10. It consists of the peripheral and central hearing systems. The peripheral system includes the outer ear, middle ear, and inner ear. Sound waves are first received by the auricle. After passing through the ear canal, they arrive at the eardrum and make it vibrate. This vibration is perceived by the basilar membrane of the cochlea. On the surface of the basilar membrane, there are thousands of hair cells. The vibration of the basilar membrane triggers nerve impulses in the hair cells. The nerve impulses of


Fig. 7  STFT spectrum of arc sound signals when SNR is 5 dB (frequency f/kHz versus time t/s). a Non-penetration. b Full-penetration. c Excessive-penetration

Fig. 8  Distribution of penetration state features in the hidden layer when SNR is 5 dB (first principal component versus second principal component)

Fig. 9  The changes of identification accuracy rates with SNR of arc sound signals (accuracy versus signal-to-noise ratio SNR/dB for non-penetration, full-penetration, and excessive-penetration)
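
Noisy test signals of the kind analyzed in Figs. 7 to 9 can be generated by scaling white Gaussian noise to a target SNR. The sketch below follows the additive model of Eq. (4) and the energy-ratio definition of Eq. (5), reading the logarithm as base 10 (an assumption consistent with the dB unit); the function names and the 4 kHz stand-in signal are hypothetical.

```python
import math
import random


def measure_snr_db(signal, noise):
    """SNR in dB as the ratio of summed squared samples, as in Eq. (5)."""
    return 10 * math.log10(sum(v * v for v in signal) / sum(v * v for v in noise))


def add_white_noise(signal, target_db, rng=None):
    """Return (noisy, noise) where noise is white Gaussian noise scaled to target_db."""
    rng = rng or random.Random()
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    e_signal = sum(v * v for v in signal)
    e_noise = sum(v * v for v in noise)
    # Choose gain so that 10*log10(e_signal / (gain**2 * e_noise)) equals target_db.
    gain = math.sqrt(e_signal / (e_noise * 10 ** (target_db / 10)))
    noise = [gain * v for v in noise]
    noisy = [s + d for s, d in zip(signal, noise)]   # x(n) = s(n) + d(n), Eq. (4)
    return noisy, noise


# Hypothetical clean segment: 100 ms of a 4 kHz tone sampled at 40 kHz.
clean = [math.sin(2 * math.pi * 4000 * i / 40_000) for i in range(4000)]
noisy, noise = add_white_noise(clean, target_db=5.0, rng=random.Random(0))
print(round(measure_snr_db(clean, noise), 6))   # prints: 5.0
```

Because the noise is rescaled after it is drawn, the realized SNR matches the target up to floating-point error rather than only on average.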

the hair cells are transmitted to the auditory center through the auditory nerve. In the auditory center, the sound signals are processed and then interpreted by the brain.
In terms of signal processing, the intensities of the sound waves received by the auricle are amplified during transmission through the outer ear, while in the middle ear they are attenuated. Whether the sound intensity is amplified or attenuated depends on the frequency of the sound waves. In general, the most sensitive frequency range of the human auditory system is 3–4 kHz. When sound waves arrive at the cochlea, they are resolved into a number of frequency bands. Depending on the center frequency of each band, the sound intensity in that band is amplified or attenuated. Usually, the auditory attributes of sound waves resolved into specific bands are called timbre. The timbre of sound waves is perceived by the central auditory system and interpreted by the human brain. Finally, based on the timbre features of sounds, subjective assessments are made in the human brain and the characteristics of the sound source are distinguished.
Most physiological acoustics research shows that the human auditory system has strong anti-interference performance. In this study, the functions of the peripheral auditory system are simulated to process the arc sound signals and extract their features.

5.2 CNN network for auditory spectrum images of sounds

In this study, the human auditory model and the CNN are combined to build a hybrid model for identifying the penetration states of the weld seam. The structure of the proposed hybrid model is shown in Fig. 11. As shown in Fig. 11, the


arc sound signals are first filtered with the external and middle ear transfer function. Then a Gamma-tone filter bank is adopted to simulate the frequency resolution function of the cochlea and obtain the auditory spectrum of arc sounds. Finally, based on the auditory spectrum images of arc sounds, a CNN is applied to identify the penetration states. The CNN model is the same as in Fig. 5.

Fig. 10  The structure of human auditory system (labels: sound wave, auricle, cavum conchae, cartilaginous part, ear canal, eardrum, otosteon, cochlea, auditory nerve, cerebral cortex; regions: out-ear, middle-ear, inner-ear, auditory center)

Fig. 11  Hybrid model based on the human auditory principle and CNN network (arc sound signal → external and middle ear transfer function → Gamma-tone filter bank frequency resolution of arc sound → auditory spectrogram → convolutional neural network → identification of penetration states)

5.3 Auditory spectrum of arc sounds

To acquire the auditory spectrum of arc sounds, the external and middle ear transfer function is built first. The American National Standards Institute [19] has published a universal external and middle ear transfer function, which is adopted directly in this study.
During transmission within the cochlea, a sound wave stimulates different parts of the cochlea. High-frequency signals stimulate the bottom part of the cochlea, and low-frequency signals stimulate the top part. Greenwood et al. [20] found that the relation between the frequency f and the position x along the cochlea can be expressed by Eq. (6).


$$f=B\left(10^{a(L-x)}-k\right) \tag{6}$$

where L is the length of the cochlea; f is the sound frequency; x is the position along the cochlea; and a, B, and k are constants. To simulate the frequency-resolving function of the cochlea, a number of models have been proposed. Among them, the Gamma-tone model proposed by Johannesma [21] has a relatively simple expression and mature applications. Therefore, it is adopted in this study to decompose the arc sound signals; its expression is given in Eq. (7).

$$g_m(t)=A\,t^{n-1}\exp(-2\pi B_m t)\cos(2\pi f_m t+\phi_m)\,u(t) \tag{7}$$

where g_m is the impulse response function of the m-th filter, 1 ≤ m ≤ M; M is the number of filters; A is the gain of the filter; n is the order of the filter; f_m is the center frequency of the filter; φ_m is the phase of the signal; u(t) is the unit step function; and B_m is the bandwidth parameter obtained from the equivalent rectangular bandwidth (ERB) at f_m.

$$B_m=1.019\,\mathrm{ERB}(f_m) \tag{8}$$

$$\mathrm{ERB}(f_m)=24.7\,(0.00437 f_m+1) \tag{9}$$

Based on Eqs. (7)–(9), acquiring g_m requires the number of filters M, the center frequency f_m, the signal phase φ_m, and the filter order n. Generally, the human ear is insensitive to the phase of sound signals, so φ_m is usually set to zero. In this study, M is set to 64 and n is set to 4. The center frequency f_m is obtained with Eqs. (10) and (11).

$$v=\frac{9.26\left[\ln(f_H+228.7)-\ln(f_L+228.7)\right]}{M} \tag{10}$$

$$f_m=(f_H+228.7)\,e^{-\frac{mv}{9.26}}-228.7 \tag{11}$$

where f_H is the highest cutoff frequency, f_L is the lowest cutoff frequency, and v is the overlap of two adjacent filters. Considering that the human auditory system is insensitive to sound signals whose frequency is more than 16 kHz, f_H is set to 16 kHz and f_L is set to 1 Hz in this study. Figure 12 shows the frequency response of the designed Gamma-tone filter bank. The center frequencies of the designed filter bank are distributed non-linearly. At low frequencies the filter bank has high resolution, while at high frequencies it has low resolution.

Fig. 12  Frequency response of the designed Gamma-tone filter bank (frequency response gain/dB versus frequency f/kHz)

Figure 13 is an example of arc sound signals decomposed into 16 bands by a Gamma-tone filter bank. If the signals in every frequency band are stacked and shown in a 2-dimensional diagram, the auditory spectrum images of arc sounds are acquired. The arc sound auditory spectrum images under the three penetration states are shown in Fig. 14.

Fig. 13  Arc sound signals after resolved by the Gamma-tone model (arc sound s/V)

It is observed from Fig. 14 that the auditory spectrum of arc sound is mainly distributed in the frequency band of 1–8 kHz. Compared with Fig. 4, the frequency distribution bandwidth becomes wider in the auditory spectrum of arc sound. The auditory spectrum image contains more information than the STFT spectrum image, which is beneficial for penetration identification.

5.4 Recognition results

For comparison with the method proposed in Sect. 4, the same data sets are used in this section for training and testing the CNN. The scatter diagram of the first two principal components of the arc sound auditory spectrum images in the hidden layer is shown in Fig. 15.
Figure 15 shows that the three types of penetration states are separated thoroughly. Compared with Fig. 6, the distribution of each penetration state is more concentrated. The identification result based on the auditory spectrum images and


Fig. 14  Auditory spectrum images of arc sounds (frequency f/kHz versus time t/s). a Non-penetration. b Full-penetration. c Excessive-penetration
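
The Gamma-tone filter bank behind the auditory spectrum images can be sketched from Eqs. (7) to (11). The example below uses the stated settings (M = 64 filters, order n = 4, f_H = 16 kHz, f_L = 1 Hz, zero phase) and a 40 kHz sampling rate; the unit gain A = 1, the 25 ms impulse-response length, and the function names are assumptions made only for illustration.

```python
import math

F_HIGH = 16_000.0   # highest cutoff frequency f_H in Hz
F_LOW = 1.0         # lowest cutoff frequency f_L in Hz
M = 64              # number of filters
ORDER = 4           # filter order n


def centre_frequencies(f_low=F_LOW, f_high=F_HIGH, m_filters=M):
    """Centre frequencies f_1..f_M from Eqs. (10) and (11), highest first."""
    v = 9.26 * (math.log(f_high + 228.7) - math.log(f_low + 228.7)) / m_filters
    return [(f_high + 228.7) * math.exp(-m * v / 9.26) - 228.7
            for m in range(1, m_filters + 1)]


def erb(f_centre):
    """Equivalent rectangular bandwidth, Eq. (9)."""
    return 24.7 * (0.00437 * f_centre + 1.0)


def impulse_response(f_centre, fs=40_000, duration=0.025, order=ORDER, gain=1.0):
    """Sampled Gamma-tone impulse response of Eq. (7) with zero phase, for t >= 0."""
    b = 1.019 * erb(f_centre)                     # Eq. (8)
    n_samples = round(fs * duration)
    return [gain * (i / fs) ** (order - 1)
            * math.exp(-2 * math.pi * b * i / fs)
            * math.cos(2 * math.pi * f_centre * i / fs)
            for i in range(n_samples)]


fc = centre_frequencies()
h = impulse_response(fc[20])
print(len(fc), round(fc[-1], 6))   # prints: 64 1.0
```

With this indexing the last filter sits exactly at f_L, and the spacing between neighbouring centre frequencies shrinks toward low frequencies, which is the non-linear distribution described for the filter bank. Convolving a signal with each `impulse_response` and stacking the band outputs would give the rows of an auditory spectrum image.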

the CNN network is shown in Table 4. The accuracy rate is more than 99%.

Fig. 15  Distribution of auditory spectrum features in the hidden layer (first principal component versus second principal component)

Table 4  Penetration state identification results based on arc sound auditory spectrum images and CNN network
Penetration state       Non-penetration  Full-penetration  Excessive-penetration  Whole recognition rate
Non-penetration         98.3%            1.7%              0
Full-penetration        0                100%              0
Excessive-penetration   0                0                 100%
Whole recognition rate                                                            99.4%

5.5 Anti-interference performance analysis

A series of white Gaussian noises with different intensities is mixed into the arc sound samples. Figure 16 shows the auditory spectrum images of arc sound signals when the SNR is 5 dB after white Gaussian noise is mixed in.
Compared with Fig. 14, more of the auditory spectrum is distributed in the high-frequency area. However, in the frequency band of 1–8 kHz, no significant changes are found. Figure 17 shows the scatter diagram of the first two principal components of the arc sound auditory spectrum images in the hidden layer when the SNR is 5 dB. Compared with Fig. 15, there is no significant change. The three penetration types are still easily distinguished in this scatter diagram.
Figure 18 shows how the identification accuracy rates change as the SNR of the arc sound signals decreases. As the SNR decreases, the identification accuracy rate stays relatively high, which means that the anti-interference performance of the proposed identification method is significantly improved by using auditory spectrum images.

6 Conclusions

To monitor the penetration states of the weld seam through arc sound signals, the time–frequency spectrum images of arc sounds are acquired. Based on these images, the penetration states of the weld seam are identified with a convolution neural network. To improve the anti-interference ability of the developed method, a human auditory


15 15 15

Frequency f/kHz

Frequency f/kHz
12 12
Frequency f/kHz

12

8 8 8

4 4 4

0 0 0
0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5
Time t/s Time t/s Time t/s
(a) Non-penetration (b) Full-penetration (c) Excessive-penetration

Fig. 16  Auditory spectrum images of arc sounds when SNR is 5 dB. a Non-penetration. b Full-penetration. c Excessive-penetration

system model is built, and the auditory spectrum images


20
Non-penetration of arc sounds are obtained. The conclusions of this study
are the following.
1. Based on the time–frequency images of arc sound signals, a convolution neural network can identify the penetration states of the weld seam with a relatively high accuracy rate. However, the anti-interference ability of this method is not acceptable for practical application.
2. The auditory spectrum images of arc sound signals reflect the penetration states of the weld seam. Based on these images, the anti-interference ability of the identification method is improved significantly. Even when the signal-to-noise ratio is less than 5 dB, the identification accuracy rate is still more than 95%.
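The signal-to-noise ratios quoted in the conclusions come from mixing white Gaussian noise into the arc sound samples. The minimal sketch below shows one common way such a mix can be produced for a target SNR, using the definition SNR(dB) = 10 log10(P_signal / P_noise). It is not the authors' code: the function names, the 40 kHz sampling rate, and the 1 kHz test tone standing in for an arc sound recording are all assumptions made for illustration.

```python
import math
import random


def signal_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def add_white_gaussian_noise(samples, snr_db, rng):
    """Return samples plus zero-mean Gaussian noise scaled to a target SNR in dB.

    SNR(dB) = 10 * log10(P_signal / P_noise), so the required noise power
    is P_signal / 10 ** (snr_db / 10).
    """
    noise_power = signal_power(samples) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, computed from the clean signal and its noisy copy."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(noise))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    fs = 40_000  # hypothetical sampling rate in Hz, not taken from the paper
    # Stand-in "arc sound": a 1 kHz tone; a real recording would be loaded instead.
    clean = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(fs)]
    for snr_db in (20, 15, 10, 5, 0):  # SNR levels matching the axis ticks of Fig. 18
        noisy = add_white_gaussian_noise(clean, snr_db, rng)
        print(snr_db, round(measured_snr_db(clean, noisy), 2))
```

Because the noise is random, the measured SNR of any single realization only approximates the target. With one second of samples the deviation is typically a few hundredths of a decibel, which is small compared with the 5 dB steps considered here.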
Fig. 17  Distribution of auditory spectrum features in the hidden layer when SNR is 5 dB (horizontal axis: second principal component; vertical axis: first principal component; classes: non-penetration, full-penetration, excessive-penetration)

Fig. 18  The identification accuracy rates versus the SNR of arc sound signals (horizontal axis: signal-to-noise ratio SNR/dB, from 20 to 0; vertical axis: accuracy; curves: non-penetration, full-penetration, excessive-penetration)

Author contribution  Yanfeng Gao: conceptualization, methodology, writing – original draft preparation, funding acquisition. Qisheng Wang: investigation. Jianhua Xiao: data curation, visualization. Genliang Xiong: writing – review and editing. Hua Zhang: supervision, project administration.

Funding  This work was supported by the Natural Science Foundation of Shanghai (21ZR1425900, 21010501600).

Data availability  Not applicable.

Code availability  Not applicable.

Declarations

Ethics approval  Not applicable.

Consent to participate  All the authors listed have participated in the preparation of the manuscript.


Consent for publication  This manuscript is approved by all the authors for publication. It is original research that has not been published previously and is not under consideration for publication elsewhere, in whole or in part.

Conflict of interest  The authors declare no competing interests.
