Article history: Available online 30 July 2021

Keywords: Wireless interference signal recognition; Quadruple generative adversarial network; Semi-supervised learning; Knowledge distillation

Abstract

Cross-technology interference sources pose a great challenge to improving the throughput of wireless local area networks (WLANs), and wireless interference signal recognition (WISR) can provide a precondition for mitigating this problem. The quadruple generative adversarial network (QGAN) has shown prevailing performance for specific emitter identification (SEI). In this paper, an enhanced collaborative learning mechanism is proposed to improve QGAN's performance for WISR. ACGAN is incorporated into the Improved-QGAN architecture to substitute the original GAN, and the loss functions are further optimized for the generative, representation and classification sub-networks. Besides, a lightweight model based on knowledge distillation (KD) is presented to reduce memory consumption and computational complexity at the inference phase. Numerical results indicate that the proposed Improved-QGAN outperforms the other baseline algorithms on both the experimental dataset and a benchmark dataset.
© 2021 Elsevier Inc. All rights reserved.
https://doi.org/10.1016/j.dsp.2021.103188
1051-2004/© 2021 Elsevier Inc. All rights reserved.
X. Xu, T. Jiang, J. Gong et al. Digital Signal Processing 117 (2021) 103188
In addition, the generated data can serve as supplementary labeled samples to optimize C; the loss function on these data is

\mathcal{L}_{CE\text{-}G}(\theta_C) = \mathbb{E}_{p_g}\left[-\log p_c(y_g \mid f_g)\right]. \quad (10)

As with Triple-GAN, we add the adversarial training term via D to C; the optimization of the classifier can be formulated as follows:

\min_C \; \mathcal{L}_{CE\text{-}real}(\theta_C) + \nu\, \mathcal{L}_{CE\text{-}G}(\theta_C) + \eta\, \mathbb{E}_{x_u,\tilde{y}_u \sim p_u}\left[\log\left(1 - D(f_u, \tilde{y}_u)\right)\right], \quad (11)

where \nu is another weighting factor.

3.1.3. Generative model G

The generator G attempts to generate data f_g = G(z_g, y_g) \sim p_g(f_g \mid z_g, y_g) similar to the true data distribution p_l(f_l, y_l). Ideally, the generated data fits the real data distribution exactly and can further improve the performance of C by providing meaningful labeled data beyond the real data.

As is well known, forcing a model to perform additional tasks can improve performance on the original task [19,20]. ACGAN [21] is a variant of the GAN architecture that introduces an auxiliary classifier and is superior at generating more discriminable samples. In the original ACGAN, the generator G uses both a class label y and a latent vector z to generate images X_{fake} = G(y, z). The discriminator D then gives a probability over sources (real or fake distribution) and a probability over the class label: p(S \mid X), p(Y \mid X) = D(X). The loss function includes two parts, \mathcal{L}_S for the correct source and \mathcal{L}_C for the correct class; D tries to maximize \mathcal{L}_S + \mathcal{L}_C, while G is optimized to maximize \mathcal{L}_C - \mathcal{L}_S:

\mathcal{L}_S = \mathbb{E}\left[\log p(S = real \mid X_{real})\right] + \mathbb{E}\left[\log p(S = fake \mid X_{fake})\right],
\mathcal{L}_C = \mathbb{E}\left[\log p(Y = y \mid X_{real})\right] + \mathbb{E}\left[\log p(Y = y \mid X_{fake})\right]. \quad (12)

Motivated by ACGAN, we modified the loss function of G in Improved-QGAN. The quality of the data generated by G at the beginning may be poor; however, D still tries to classify them correctly, which may have a negative effect on D. We therefore divide \mathcal{L}_C into two parts from the different perspectives of D and G. For D, we only maximize \mathcal{L}_{C\text{-}real} = \mathbb{E}\left[\log p(Y = y \mid X_{real})\right] on real data. For G, we maximize \mathcal{L}_{C\text{-}fake} = \mathbb{E}\left[\log p(Y = y \mid X_{fake})\right] on fake data via D. This loss aims at making the generated data fool D into assigning the correct labels. We apply this variation to Improved-QGAN through the already existing classifier C. As mentioned above, C only maximizes \mathcal{L}_{C\text{-}real} on real data, which is done by the cross entropy in Eq. (11). As for G, in Improved-QGAN, G takes z_g and y_g as input and outputs the generated data f_g. Then f_g is sent to C, and C gives the predicted label \tilde{y}_g. Naturally, the loss function of G is denoted as

\mathcal{L}_{C\text{-}fake}(\theta_g) = -\mathbb{E}\left[\log p(\tilde{y}_g = y_g \mid f_g)\right]. \quad (13)

We add the adversarial training term in the same way; the loss function of G can be defined as follows:

\min_G \; \mathcal{L}_{C\text{-}fake}(\theta_g) + \lambda\, \mathbb{E}_{x_g, y_g \sim p_g}\left[\log\left(1 - D(f_g, y_g)\right)\right], \quad (14)

where the hyper-parameter \lambda controls the trade-off between the two terms.

3.1.4. Discriminative model D

The discriminator D aims to determine the source of its input correctly, i.e., real or pseudo data distribution. The data-label pair in which both the data and the label are true is considered the real data distribution, and the discriminator outputs "1". The other two pseudo data-label pairs, in which either the data is generated by G and the label is true, or the data is true and the label is predicted by C, are regarded as fake data distributions, and the discriminator outputs "0". The corresponding optimization formulation is presented as follows:

\max_D \; \mathbb{E}_{x_l, y_l \sim p_l}\left[\log D(f_l, y_l)\right] + \alpha\, \mathbb{E}_{x_u, \tilde{y}_u \sim p_u}\left[\log\left(1 - D(f_u, \tilde{y}_u)\right)\right] + (1-\alpha)\, \mathbb{E}_{x_g, y_g \sim p_g}\left[\log\left(1 - D(f_g, y_g)\right)\right]. \quad (15)

The details of the optimization process for Improved-QGAN are summarized in Algorithm 1.

Algorithm 1 The proposed semi-supervised Improved-QGAN model.
Require:
  Labeled known data X_l and corresponding labels y_l, unlabeled data X_u;
  the number of training iterations iter;
  learning rates \varsigma_R, \varsigma_G, \varsigma_C, \varsigma_D for networks R, G, C, D;
  batch sizes for real labeled, unlabeled and generated data: b_l, b_u, b_g;
  parameters of the networks (\Theta_R, \Theta_G, \Theta_C, \Theta_D);
  several hyper-parameters \alpha, \beta, \mu, \lambda, \eta, \nu.
 1: for i = 1 to iter do
 2:   Sample a batch of labeled data (x_l, y_l) of size b_l from (X_l, y_l), a batch of unlabeled data x_u of size b_u from X_u, and a batch of latent vectors z and real labels y_g of size b_g from the distributions p_z(z) and p(y_l), respectively;
 3:   Get the representations of labeled data f_l \leftarrow R_{\theta_R}(x_l) and unlabeled data f_u \leftarrow R_{\theta_R}(x_u); get the generated data f_g \leftarrow G_{\theta_G}(z, y_g); get the predicted labels \tilde{y}_l \leftarrow C_{\theta_C}(f_l), \tilde{y}_u \leftarrow C_{\theta_C}(f_u), \tilde{y}_g \leftarrow C_{\theta_C}(f_g);
 4:   Make up the real data-label pair (f_l, y_l) and the two pseudo data-label pairs (f_g, y_g), (f_u, \tilde{y}_u);
 5:   Update the discriminative model D:
      \theta_D \leftarrow Adam(\nabla_{\theta_D}(\mathbb{E}_{p_l}[\log D(f_l, y_l)] + \alpha\, \mathbb{E}_{p_u}[\log(1 - D(f_u, \tilde{y}_u))] + (1-\alpha)\, \mathbb{E}_{p_g}[\log(1 - D(f_g, y_g))]), \theta_D, \varsigma_D);
 6:   Update the representation model R:
      \theta_R \leftarrow Adam(\nabla_{\theta_R}(\mathcal{L}_{recon}(\theta_R) + \mu\, \mathcal{L}_{CE\text{-}real}(\theta_R) + \eta\, \mathbb{E}_{p_u}[\log(1 - D(f_u, \tilde{y}_u))]), \theta_R, \varsigma_R);
 7:   Update the classification model C:
      \theta_C \leftarrow Adam(\nabla_{\theta_C}(\mathcal{L}_{CE\text{-}real}(\theta_C) + \nu\, \mathcal{L}_{CE\text{-}G}(\theta_C) + \eta\, \mathbb{E}_{p_u}[\log(1 - D(f_u, \tilde{y}_u))]), \theta_C, \varsigma_C);
 8:   Update the generative model G:
      \theta_G \leftarrow Adam(\nabla_{\theta_G}(\mathcal{L}_{C\text{-}fake}(\theta_g) + \lambda\, \mathbb{E}_{p_g}[\log(1 - D(f_g, y_g))]), \theta_G, \varsigma_G);
 9: end for
10: return the learned parameters (\Theta_R, \Theta_G, \Theta_C, \Theta_D).

3.2. Theoretical analysis

We assume that the feature f output by R is suitable for subsequent classification. For fixed R, the adversarial losses of G and C can be rewritten as:

\alpha\, \mathbb{E}_{p_u}\left[\log(1 - D(f_u, \tilde{y}_u))\right] + (1-\alpha)\, \mathbb{E}_{p_g}\left[\log(1 - D(f_g, y_g))\right]
= \alpha \iint p_u(f, y) \log(1 - D(f, y))\, dy\, df + (1-\alpha) \iint p_g(f, y) \log(1 - D(f, y))\, dy\, df
= \iint p_\alpha(f, y) \log(1 - D(f, y))\, dy\, df
= \mathbb{E}_{p_\alpha}\left[\log(1 - D(f, y))\right],

where p_\alpha(f, y) = \alpha\, p_u(f, y) + (1-\alpha)\, p_g(f, y).

Then the loss function of the minimax game between G, C and D is equivalent to the following expression:

\min_{C, G} \max_D \; \mathbb{E}_{p_l}\left[\log D(f_l, y_l)\right] + \mathbb{E}_{p_\alpha}\left[\log\left(1 - D(f, y)\right)\right].
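The mixture identity derived above can be checked numerically. The sketch below (not the authors' code) uses toy discrete distributions as stand-ins for p_u and p_g and an arbitrary fixed discriminator D, and confirms that the weighted sum of the two adversarial expectations equals a single expectation under p_alpha = alpha*p_u + (1-alpha)*p_g:

```python
import math

# Toy discrete stand-ins for p_u(f, y), p_g(f, y) over 4 (f, y) atoms,
# and arbitrary fixed discriminator outputs D(f, y) in (0, 1).
p_u = [0.1, 0.3, 0.2, 0.4]
p_g = [0.25, 0.25, 0.4, 0.1]
D = [0.8, 0.3, 0.55, 0.6]
alpha = 0.5

# Left-hand side: alpha * E_pu[log(1-D)] + (1-alpha) * E_pg[log(1-D)]
lhs = alpha * sum(pu * math.log(1 - d) for pu, d in zip(p_u, D)) \
    + (1 - alpha) * sum(pg * math.log(1 - d) for pg, d in zip(p_g, D))

# Right-hand side: one expectation under the mixture p_alpha
p_alpha = [alpha * pu + (1 - alpha) * pg for pu, pg in zip(p_u, p_g)]
rhs = sum(pa * math.log(1 - d) for pa, d in zip(p_alpha, D))

print(abs(lhs - rhs) < 1e-12)   # True: the two forms of the loss coincide
```

The collapse into one expectation is what lets the minimax game be analyzed against the single mixture distribution p_alpha rather than against p_u and p_g separately.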
Table 1
Collection settings for WISR dataset.

Table 2
Network training configurations for Improved-QGAN.

Number of classes: 8 / 11
Training set size: 4.0e4 / 4.5e4
Learning rate: 1e-3
Batch size: 100 / 200
Iterations: 400
Optimizer: Adam
Length of data: 1024
[\mu, \eta, \nu, \lambda, \alpha, \beta]: [1, 0.01, 0.1, 0.1, 0.5, 0.4]

Noise is a common assumption in the communication domain. To simplify the problem, additive white Gaussian noise is added to the signal to obtain datasets with different SNRs. Table 1 shows the details of the dataset. The amount of data for each category is 5000, and the length of each sample is 1024. 80% of the signal data samples, selected at random, are used for training and the rest for testing and validation.

4.1.2. Automatic modulation classification

The dataset we use for AMC is the benchmark dataset generated in [28]. Specifically, it is made up of 11 categories that are commonly seen in impaired environments, including the following modulation formats: OOK, 4ASK, BPSK, QPSK, 8PSK, 16QAM, AM-SSB-SC, AM-DSB-SC, FM, GMSK, OQPSK. Each sample is corrupted by random noise, time offset, phase, and wireless channel distortions. The amount of data for each category is 4096, and the length of each sample is 1024. 85% of the examples, selected at random, are used for training and the rest for testing and validation. The SNR of the dataset used in our experiment varies between −4 dB and 12 dB.

4.2. Implementation details

In our experiments, the neural networks are built with TensorFlow on Python 3.6 and trained on one NVIDIA GeForce GTX 1080 Ti GPU. Configuration details of the model training are listed in Table 2. The architecture of R is a CNN with an attention mechanism. The specific structures of R and C used for the two tasks are different, while D and G are the same. The details are given in Table 3.

4.3. Performance metrics

In order to fully demonstrate and compare the classification performance of the proposed model and the baseline models, we use multiple performance metrics to show the experimental results.

Classification accuracy: the ratio of the number of correctly predicted samples to the total number of samples. Let the predicted and true labels be denoted as \tilde{y} and y; then classification accuracy can be defined by the following formula:

Acc = \frac{1}{N} \sum_{i=1}^{N} I(\tilde{y}_i = y_i). \quad (22)

F1 score: let TP, FP and FN stand for the number of positive samples correctly predicted as positive, the number of negative samples incorrectly predicted as positive, and the number of positive samples incorrectly predicted as negative, respectively. Then the performance metrics precision P_c, recall R_c and F1 score can be denoted by:

P_c = \frac{TP}{TP + FP}, \quad R_c = \frac{TP}{TP + FN}, \quad F1 = \frac{2 P_c R_c}{P_c + R_c}.

Confusion matrix: in the field of deep learning classification, a confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm. It is a tabular summary of the numbers of correct and incorrect predictions made by a classifier and shows the identification performance clearly.

4.4. Numerical results

In this subsection, we present various simulation results for our proposed model. We also evaluate the performance of the lightweight model obtained from Improved-QGAN.

4.4.1. Classification performance

Classification performance vs. amount of labeled data. In semi-supervised learning, the amount of labeled data is an important parameter that influences the adaptability and performance of the algorithm in different situations. In this subsection, the classification accuracy under different amounts of labeled data is tested. For WISR, the amount of labeled data is uniformly distributed from 80 to 240 per class and the SNR is about 4 dB. For AMC, the amount of labeled data is uniformly distributed from 400 to 1200 per class and the selected SNR is 4 dB. The input data length for both tasks is 1024. Fig. 3 shows how the classification accuracy changes with the amount of labeled data for WISR, while Fig. 4 is based on the AMC dataset. CNN stands for the network having the same architecture as the R and C sub-networks in Improved-QGAN.

Fig. 3. Classification accuracy comparison with different amounts of labeled data for WISR dataset.

From Fig. 3 and Fig. 4, we can see that the classification performance improves as the amount of labeled data increases, and the classification performance of the GAN-based models is better than that of the CNN. Besides, Improved-QGAN outperforms the baseline models (the original Triple-GAN and our previous QGAN) on both tasks.

Classification performance vs. data length. The proposed model is a data-driven DL approach, and the properties of the input data can impact
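The metrics defined in Section 4.3 (accuracy per Eq. (22), per-class precision/recall/F1 and the confusion matrix) can be sketched as follows; the labels below are toy values for illustration, not taken from the WISR or AMC experiments:

```python
def confusion_matrix(y_true, y_pred, n_classes):
    # Rows index the true class, columns the predicted class.
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def per_class_metrics(cm, c):
    tp = cm[c][c]
    fp = sum(cm[r][c] for r in range(len(cm))) - tp  # other classes predicted as c
    fn = sum(cm[c][r] for r in range(len(cm))) - tp  # class c predicted as others
    prec = tp / (tp + fp) if tp + fp else 0.0        # P_c = TP / (TP + FP)
    rec = tp / (tp + fn) if tp + fn else 0.0         # R_c = TP / (TP + FN)
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 1]
cm = confusion_matrix(y_true, y_pred, 3)
acc = sum(cm[c][c] for c in range(3)) / len(y_true)   # Eq. (22)
print(acc)                       # 0.75: 6 of 8 samples classified correctly
print(per_class_metrics(cm, 1))  # precision 0.75, recall 1.0 for class 1
```

Averaging the per-class precision, recall and F1 over all classes gives the macro-averaged P_avg, R_avg and F1_avg reported in Tables 9 and 10.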
Table 3
Detailed network structure of Improved-QGAN for different tasks.

Fig. 4. Classification accuracy comparison with different amounts of labeled data for AMC dataset.

Fig. 5. Classification accuracy comparison w.r.t. data length for WISR dataset.
Table 4
Performance comparison at different SNR regimes.
Fig. 8. Classification accuracy comparison with different SNRs for AMC dataset.
Table 5
A comparison of epochs used to achieve target accuracy.
Table 6
Computation comparison between Triple-GAN and Improved-QGAN.
Table 7
Student compression ratio and classification accuracy for WISR.
Table 8
Student compression ratio and classification accuracy for AMC.
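The lightweight student models compared in Tables 7-10 are trained by knowledge distillation. A minimal sketch of the standard soft/hard-target KD objective of Hinton et al. [24] follows; the temperature T and weight w here are illustrative choices, not the paper's settings, and the logits are toy values:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; subtracting the max improves stability.
    m = max(z / T for z in logits)
    exps = [math.exp(z / T - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_logits, true_label, T=4.0, w=0.7):
    # Soft-target term: cross-entropy between the teacher's and student's
    # softened distributions at temperature T, scaled by T^2 as in [24].
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = -sum(pt * math.log(ps) for pt, ps in zip(p_t, p_s)) * T * T
    # Hard-target term: ordinary cross-entropy with the ground-truth label.
    hard = -math.log(softmax(student_logits)[true_label])
    return w * soft + (1 - w) * hard

student = [2.0, 0.5, -1.0]   # logits from a compressed student network
teacher = [3.0, 0.2, -2.0]   # logits from a trained Improved-QGAN teacher
print(kd_loss(student, teacher, true_label=0) > 0)   # True
```

Minimizing this combined loss lets a much smaller student approach the teacher's accuracy, which is the effect quantified by the compression ratios and accuracies in Tables 7-10.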
Table 9
Classification accuracy comparison of lightweight network for WISR.

Student model | Teacher model | P_avg | R_avg | F1_avg | Acc
Teacher       | QGAN          | 0.937 | 0.930 | 0.931 | 0.933
              | Improved-QGAN | 0.957 | 0.956 | 0.956 | 0.957
Student 1     | QGAN          | 0.930 | 0.929 | 0.928 | 0.929
              | Improved-QGAN | 0.953 | 0.953 | 0.952 | 0.953
Student 2     | QGAN          | 0.928 | 0.927 | 0.926 | 0.928
              | Improved-QGAN | 0.949 | 0.949 | 0.949 | 0.950
Student 3     | QGAN          | 0.924 | 0.925 | 0.924 | 0.926
              | Improved-QGAN | 0.947 | 0.945 | 0.945 | 0.946
Student 4     | QGAN          | 0.921 | 0.920 | 0.920 | 0.922
              | Improved-QGAN | 0.942 | 0.941 | 0.941 | 0.943
Student 5     | QGAN          | 0.918 | 0.914 | 0.914 | 0.916
              | Improved-QGAN | 0.937 | 0.936 | 0.936 | 0.937

Table 10
Classification accuracy comparison of lightweight network for AMC.

Student model | Teacher model | P_avg | R_avg | F1_avg | Acc
Teacher       | QGAN          | 0.987 | 0.988 | 0.987 | 0.988
              | Improved-QGAN | 0.991 | 0.991 | 0.991 | 0.992
Student 1     | QGAN          | 0.975 | 0.973 | 0.973 | 0.975
              | Improved-QGAN | 0.982 | 0.982 | 0.982 | 0.983
Student 2     | QGAN          | 0.969 | 0.969 | 0.969 | 0.971
              | Improved-QGAN | 0.977 | 0.978 | 0.977 | 0.978
Student 3     | QGAN          | 0.964 | 0.964 | 0.964 | 0.966
              | Improved-QGAN | 0.967 | 0.967 | 0.967 | 0.969
Student 4     | QGAN          | 0.960 | 0.961 | 0.960 | 0.962
              | Improved-QGAN | 0.968 | 0.969 | 0.968 | 0.970
Student 5     | QGAN          | 0.959 | 0.955 | 0.955 | 0.958
              | Improved-QGAN | 0.966 | 0.966 | 0.966 | 0.968

processing, and Visualization. Xiaowei Qin: Writing-Reviewing and Discussion.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] J. Du, C. Jiang, J. Wang, Y. Ren, M. Debbah, Machine learning for 6G wireless networks: carrying forward enhanced bandwidth, massive access, and ultrareliable/low-latency service, IEEE Veh. Technol. Mag. 15 (4) (Dec. 2020) 122-134.
[2] M. Kulin, T. Kazaz, I. Moerman, E. De Poorter, End-to-end learning from spectrum data: a deep learning approach for wireless signal identification in spectrum monitoring applications, IEEE Access 6 (2018) 18484-18501.
[3] X. Li, F. Dong, S. Zhang, W. Guo, A survey on deep learning techniques in wireless signal recognition, Wirel. Commun. Mob. Comput. 2019 (2019) 1-12.
[4] K. Kim, I.A. Akbar, K.K. Bae, J.-S. Um, C.M. Spooner, J.H. Reed, Cyclostationary approaches to signal detection and classification in cognitive radio, in: 2nd IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks, IEEE, Apr. 2007.
[5] C.M. Spooner, A.N. Mody, J. Chuang, J. Petersen, Modulation recognition using second- and higher-order cyclostationarity, in: 2017 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), Mar. 2017.
[6] Z. Gan, L. Chen, W. Wang, Triangle generative adversarial networks, in: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS), Dec. 2017.
[7] J. Gong, X. Xu, Y. Qin, W. Dong, A generative adversarial network based framework for specific emitter characterization and identification, in: 11th International Conference on Wireless Communications and Signal Processing (WCSP), Oct. 2019.
[8] C. Ledig, L. Theis, F. Huszar, J. Caballero, et al., Photo-realistic single image super-resolution using a generative adversarial network, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jul. 2017.
[9] L. Yu, W. Zhang, J. Wang, Y. Yu, SeqGAN: sequence generative adversarial nets with policy gradient, in: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, Feb. 2017.
[10] T. Schlegl, P. Seeböck, S.M. Waldstein, U. Schmidt-Erfurth, G. Langs, Unsupervised anomaly detection with generative adversarial networks to guide marker discovery, Inf. Process. Med. Imag. (May 2017).
[11] I. Goodfellow, et al., Generative adversarial nets, in: Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS), Dec. 2014.
[12] Z. Pan, W. Yu, X. Yi, A. Khan, F. Yuan, Y. Zheng, Recent progress on generative adversarial networks (GAN): a survey, IEEE Access 7 (2019) 36322-36333.
[13] A. Radford, L. Metz, S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, Comput. Sci. (2015).
[14] M. Mirza, S. Osindero, Conditional generative adversarial nets, Comput. Sci. (2014) 2672-2680.
[15] M. Arjovsky, S. Chintala, L. Bottou, Wasserstein generative adversarial networks, in: Proceedings of the 34th International Conference on Machine Learning, Aug. 2017.
[16] T.J. O'Shea, T. Roy, N. West, B.C. Hilburn, Physical layer communications system design over-the-air using adversarial networks, in: 26th European Signal Processing Conference (EUSIPCO), Sep. 2018.
[17] D. Roy, T. Mukherjee, M. Chatterjee, E. Pasiliao, Detection of rogue RF transmitters using generative adversarial nets, in: IEEE Wireless Communications and Networking Conference (WCNC), Apr. 2019.
[18] C. Zhu, L. Xu, X.-Y. Liu, F. Qian, Tensor-generative adversarial network with two-dimensional sparse coding: application to real-time indoor localization, in: IEEE International Conference on Communications (ICC), May 2018.
[19] C. Szegedy, W. Liu, Y. Jia, et al., Going deeper with convolutions, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2015.
[20] B. Ramsundar, S. Kearnes, P. Riley, et al., Massively multitask networks for drug discovery, Comput. Sci. (2015).
[21] A. Odena, C. Olah, J. Shlens, Conditional image synthesis with auxiliary classifier GANs, in: Proceedings of the 34th International Conference on Machine Learning, Aug. 2017.
[22] Y. Cheng, D. Wang, P. Zhou, T. Zhang, Model compression and acceleration for deep neural networks: the principles, progress, and challenges, IEEE Signal Process. Mag. 35 (1) (Jan. 2018) 126-136.
[23] C. Bucilua, R. Caruana, A. Niculescu-Mizil, Model compression, in: Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug. 2006.
[24] G. Hinton, O. Vinyals, J. Dean, Distilling the knowledge in a neural network, Comput. Sci. (2015).
[25] T. Jiang, X. Qin, X. Xu, J. Chen, W. Dong, Lightweight quadruple-GAN for interference source recognition in Wi-Fi networks, in: IEEE 6th International Conference on Computer and Communications (ICCC), Dec. 2020.
[26] A. Aguinaldo, P. Chiang, A. Gain, A. Patil, K. Pearson, S. Feizi, Compressing GANs using knowledge distillation, arXiv preprint, arXiv:1902.00159, 2019.
[27] X. Wang, R. Zhang, Y. Sun, J. Qi, KDGAN: knowledge distillation with generative adversarial networks, in: Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS), Dec. 2018.
[28] T.J. O'Shea, T. Roy, T.C. Clancy, Over the air deep learning based radio signal classification, IEEE J. Sel. Top. Signal Process. 12 (2018) 168-179.
[29] M. Ben-Yosef, D. Weinshall, Gaussian mixture generative adversarial networks for diverse datasets, and the unsupervised clustering of images, 2018.
[30] D. Kim, Y. Choi, J. Han, C. Choi, Y. Kim, Fast adversarial training for semi-supervised learning, in: International Conference on Learning Representations (ICLR), 2019.

Xiaodong Xu received his B.Eng. degree and Ph.D. degree in Electronic and Information Engineering from University of Science and Technology of China (USTC) in 2000 and 2007, respectively. From 2000 to 2001, he served as a Research Assistant at the R&D center, Konka Telecommunications Technology. Since 2007, he has been a faculty member with the Department of Electronic Engineering and Information Science, USTC. He is currently working with the CAS Key Laboratory of Wireless-Optical Communications, USTC. His research interests include the areas of wireless communications, signal processing, wireless artificial intelligence and information-theoretic security.

Ting Jiang received her B.Eng. degree in School of Information and Communication Engineering, Dalian University of Technology (DUT), Dalian, China, in 2018. She is currently pursuing her M.Eng. in the Department of Electronic Engineering and Information Science at University of Science and Technology of China (USTC). Her main research topic is deep learning based wireless communication and signal processing.

Jialiang Gong received the B.Eng. degree in Electronic and Information Engineering from University of Science and Technology of China (USTC), Hefei, China, in 2016. He is currently pursuing his Ph.D. degree in School of Cyberspace Science and Technology at USTC. His research interest is deep learning for specific emitter identification and cyberspace security.

Haifeng Xu received his B.Eng. degree in School of Computer Science and Information Engineering, Hefei University of Technology (HUT), Hefei, China, in 2019. He is currently pursuing his M.Eng. in the Department of Electronic Engineering and Information Science at University of Science and Technology of China (USTC). His research topic is deep learning based wireless communication and mobile computing.

Xiaowei Qin received the B.S. and Ph.D. degrees from the Department of Electrical Engineering and Information Science, University of Science and Technology of China (USTC), Hefei, China, in 2000 and 2008, respectively. Since 2014, he has been a member of staff in Key Laboratory of Wireless-Optical Communications of Chinese Academy of Sciences at USTC. His research interests include optimization theory, service modeling in future heterogeneous networks, and big data in mobile communication networks.