Professional Documents
Culture Documents
ABSTRACT Compared to the general person re-identification (re-id), pedestrian images in cross-resolution
person re-id tasks always have variant resolutions, thus the resolution mismatching problem between
low-resolution (LR) images and high-resolution (HR) images significantly limits the performance of neural
networks. The general solution is to introduce interpolation methods or super-resolution (SR) module to
normalize all the images to the same resolution. Unfortunately, current SR methods can only upscale the
entire image to a certain scale, but fail to unify the resolution of images with different width-to-height ratio.
To solve this problem, we propose a Specific-Resolution-Targeted Super-Resolution (SRTSR) embedded
person re-id method. Instead of using deep CNN-based SR methods, our SRTSR network consists of a fast
and light-weight feature learning module with a self-adaptive Swish (SA-Swish) as its activation function
to improve its performance, and a specific-resolution-targeted upscale module to upscale the images to any
specific resolution. Compared to the traditional upscale module which can only accept the upscale factor
for the entire image, our upscale module can either accept two independent and arbitrary upscale factors for
either height and width, e.g., 2.4× for height and 3.1× for width, or any specific resolution, e.g., 256 × 128,
helping to unify the resolution of images with different width-to-height ratio. Extensive experiments are
conducted and the result shows that the performance of our method is over the SR-based state-of-the-art
person re-id methods on common re-id benchmarks.
features from the LR images, putting the images with differ- methods to normalize all the images to a uniform resolution
ent resolutions to the same feature space. in person re-id tasks.
Besides, there are also a couple of works focusing on SR
method. SR method aims to upscale the LR image to HR III. OUR APPROACH
image, thus it is also the most instinct solution to resolution The main structure of our network can be divided into two
mismatching problem. CSR-GAN [7] focuses on how to use parts, SRTSR sub-network and person re-id sub-network.
SR network to unify the resolutions of pedestrian images The SRTSR network consists of the feature learning module
as close as possible by using the pre-defined threshold to which aims to extract features from LR images, and a upscale
classify the LR images to three levels, and upscale them module which aims to upscale the LR feature to HR image.
in different degrees. SING [1] adds a SR network before As for the person re-id part, to compare with other state-of-
the feature extraction module of person re-id network and the-art methods, we use ResNet-50 as the backbone of re-id
trains the separate networks for LR and HR images jointly. network to evaluate the performance of our SRTSR network.
RIPR [8] proposes a Foreground Focus SR method which In this session, we describe the details of our SRTSR network
only upscales the pedestrian body part, reducing the influence and the combined loss of the entire network.
of the background clutter. However, all the aforementioned
SR methods in person re-id network can only upscale the A. FEATURE LEARNING MODULE FORMULATION
image to an integral scale, and use additional methods, e.g., As a supplementary sub-network, SR module in person re-id
parallel network and GAN, to reduce the influence of resolu- network is required to be time-efficient, thus the feature
tion mismatching problem. learning module of the SR network is expected to be compact
and efficient to extract features from the cross-resolution
images. So, instead of the RDN [23], ESRGAN [24] and other
B. SUPER RESOLUTION
state-of-the-art deep SR networks, we implement our upscale
As we mentioned above, there are couples of works done to module on a light-weight feature learning module.
apply the SR methods onto the person re-id task, but as time We redesign the feature learning module of the SR network
goes by, now we have more choices in SR area [16], [17]. to balance the performance and time efficiency. Skip con-
Since the CNN-based Super Resolution, SRCNN, introduced nection with activation function and the residual in residual
by Dong et al. [18], the deep learning-based SR methods strategy [25], [26] is used to improve the performance, mean-
have already achieved better performance than the previous while the expand and shrink 1×1 convolution layer are added
super-resolution methods. However, the SRCNN needs to at either end of every basic block. As the experiments [21]
upscale the input LR image by interpolation methods before prove that the batch normalization (BN) layer increases the
it feeds into the network, which means there is an extra computational complexity with minor improvement on SR
step compared to other SR methods, wasting considerable tasks, it is also removed from our basic block.
time. Taking a different approach, Shi et al. [19] propose a Like many other SR methods [24], [27], we only process
sub-pixel convolution layer to upscale the image by expand- the Y channel in YCbCr color space. Thus the output channel
ing the channel. And the promoted version of SRCNN, FSR- of the upscale module is 1, and the other channels, i.e., Cb
CNN [20], using the deconvolutional layer at the end of the and Cr, will be added by the skip connection.
network to upscale the image, dramatically increases the time Table 1 illustrates the structure of our feature learning
efficiency. After then, almost all the SR network put their module and some details of our upscale module. Whereas
upscale module at the end of the network. Lim et al. [21] the size of image could be arbitrary, the change of image
propose the EDSR and MDSR, where EDSR network adds size will also influence the flops calculation. So in this table,
the activation function in the skip connection connected the we assume the input image is 256 × 128. As Fig.3 shows,
either end of basic blocks. And Brifman and Elad [22] use the to achieve better performance, we add the LR image upscaled
denoiser to handle SISR problem, which also shows tendency
to a high-quality output and fast processing.
Compared to the blooming development of SR networks,
TABLE 1. The architecture of our SR network with the input image size
the multi-scale upscaling is hardly developed in recent works. 256 × 128 and the upscale factors for height and width are both 2.0×. The
MDSR network firstly introduced the Multi-scale Learning to number of the Basic Blocks is 6.
FIGURE 3. A visual diagram of our entire network. Input images are firstly put into the SR sub-network, passing through the feature extractor and
upscale module, then the output HR images will be put into the re-id network. The loss function is the combined loss of both SR sub-network and
re-id network.
As Fig. 5 shows, the core concept of our upscale module C. OVERALL LOSS FUNCTION
is that the pixel value at (i, j) in the output SR image equals The overall loss function consists of two parts: the MAE loss
to the weighted pixel value at (i0 , j0 ) in input LR image. for SR network,
To establish the relationship between pixels on HR and LR N
images, we introduce a instinct projection equation: 1 X
L1 (Ii − b
Ii ) =
Ii − b
Ii
1 (7)
N
i=1
i j
(i, j) = f (i , j ) = f (
0 0
, ), (3) Here, the Ii and b
Ii refers to the HR images and our output
r0 r1
SR images, respectively. The cross entropy loss for the person
where symbol bc represents the floor function, the r0 denotes re-id network,
upscale factor for height and r1 denotes that for width. The N
floor function here is to avoid fatal errors as the upscale
X
CrossEntropy(FC(fi ) − Li ) = − FC(fi )log(Li ) (8)
factors could be any float numbers. i=1
Person re-id task always requires the image preprocessing
In practice, we first solely train our SR sub-network by
unit before the network to unify the input images to a certain
L1 loss, obtain the pre-trained model, and then train the entire
output resolution, instead of giving a specific upscale factor.
network by the combined loss. Thus, as for the overall loss,
So, during the testing phase, the upscale factors are no longer
we add a tiny weight α to balance the losses. So the combined
the given parameters, but calculated by
loss is,
outH outW
r0 = and r1 = . (4) L = αL1 (Ii − b
Ii ) + CrossEntropy(FC(fi ) − Li ) (9)
inH inW
The outH and outW stand for output height and width, while IV. EXPERIMENTS
inH and inW stand for input height and width. Especially, A. DATASETS AND IMPLEMENTATION DETAILS
different from the testing process, in the training process, We evaluate our person re-id network on four cross-resolution
the network randomly downscale the images to lower reso- re-id benchmarks, VR-Market, VR-MSMT17 [8] and
lutions (1/4 to 1 of its original resolution), and learn how to self-created VR-Market1501-Hard by accepting non-integral
upscale them back. In a short word, the r0 and r1 are randomly resolution proportions, and we remain the original division
generated values in the training phase, while it is determined settings of training and testing sets.
by the target resolution and input resolution in the testing The VR-Market1501 contains 12,936 images of 750 pedes-
phase. trians in the training set, and 19,732 images of 751 pedes-
Given the weight matrix and the projection equation, trians in the testing set. And in VR-MSMT17 dataset,
the pixel value at (i, j) in the output SR image can be written there are 32,621 images of 1041 persons in training set
as the product of corresponding weight for (i, j) and the pixel and 93,820 images in testing set. Compared to their orig-
value at (i0 , j0 ) in the LR feature F LR : inal dataset, all the images in VR-Market1501 and VR-
MSMT17 are downscaled to ( 14 , 1] time of their original
PV SR (i, j) = W (i, j) F LR i0 , j0 scale.
(5)
Particularly, to demonstrate the effectiveness of the meta
In this way, we can find the corresponding pixel for every upscale module, we create a new dataset VR-Market1501-
pixel in the output SR image. Then, we need to generate a Hard, where the images are randomly downscaled from
weight matrix W (i, j) of which the size is exactly the same 1/4 to 1, of which the downscale factor is generated by
with the output SR image. random.uniform(1.01, 4). To keep the images from serve dis-
To achieve better performance, initial value in the weight tortion, the difference of downscale factor for width and for
matrix cannot be idle. As the floor function in fact causes height is restricted to less than or equal to 0.4.
the loss of precision in projection process, and to reduce its All the time-related experiments are run on the com-
influence, we set the initial weight as the offset between the puter with an I7-9700K CPU and a 2070S GPU. To pur-
real value and the ideal value of the correspondence: sue better performance on either image quality assessments
(PSNR/SSIM) and recognition accuracy, the SRTSR network
i i j j will be trained by the corresponding person re-id dataset first,
Winti (i, j) = − , − (6) and then the entire network will be trained together with
r0 r0 r1 r1
the help of the pretrained model of SRTSR network. During
With initial weights, we can create a weight matrix with the the holistic training process, weight of L1 loss for SRTSR
same size of the output SR image. Passing through two fully network is 0.05.
connected (FC) layers, every single element in the weight Due to the previous LR person re-id works only present
matrix becomes an adaptive weight, and by multiplying the Rank-1 and Rank-5 results, in this paper, all the person re-id
adaptive weight and the value of corresponding pixels on LR results are also present in Rank-1 and Rank-5 form. In person
images, the values of pixels on SR images are obtained. re-id results, we fixed the size of the weight matrix while in
image quality assessments (IQA), we use the fixed upscale 1) Upscale images by a certain scale, and 2) Upscale images
factors for height and width. Since the other SR methods, to a certain resolution.
e.g. RCAN and Meta-SR, costs much time once trained with In our ablation study, we set the FSRCNN [20] as the basic
person re-id network, all the IQA(PSNR/SSIM) results are network, which is named as A, and divide our contribution
obtained by training the SR network individually. by three parts, B.the meta upscale module; C. substitute the
3 × 3 convolutional layer to the basic block of our network;
B. EFFECTIVENESS OF SA-SWISH D. the number of basic block. Also, in order to evaluate the
In this session, we compare the effectiveness of three activa- effectiveness of the skip connection, named as E, we test the
tion functions: ReLU, LeakyReLU, P-ReLU, Swish and our network with and without the skip connection, respectively.
SA-Swish. To better solely compare the difference between In the following experiment, we evaluate the effectiveness
the activation functions, we substitute our activation function of each change by objective indexes, i.e., PSNR, SSIM and
to the other, use the traditional upscale method, i.e., by accept- person re-id precision, rank-1 and rank-5. For example, A+B
ing a upscale factor for the entire image, and run the program means FSRCNN with the new upscale layer, but without the
for 20 epochs on Market-1501 dataset. basic block. And it is noticeable that A+B+C means we have
The average time in Table 2 indicates total time for one substituted two convolutional layer to our basic block, and D
entire epoch, and as the result of an additional self-adaptive indicates the number of extra basic blocks besides them.
parameter, SA-Swish and P-ReLU cost longer but acceptable Table 3 illustrates the IQA results, PSNR and SSIM, and
time to achieve better performance than other activation func- the person re-id accuracy, rank-1 and rank-5, of the ablation
tions. However, generally, the activation function will not cost study. The person re-id accuracy results are obtained by train-
too much time, while in this experiment, the processing time ing the joint network on VR-Market1501 dataset, while the
varies a lot. We think it is mainly because the convolution IQA results are obtained by solely training the SR network.
part is so light-weight and fast, thus the time-consuming
proportion of activation functions with multiple self-adaptive TABLE 3. PSNR, SSIM evaluation and time consumption of different
network structures: A. FSRCNN, B. the meta upscale module; C. substitute
parameters increases. the 3 × 3 convolutional layer to the basic block of our network; D. the
extra number of basic blocks; E. the skip connection.
TABLE 2. Comparison between activation functions, the bold denotes the
best result and underline denotes the second.
TABLE 4. Person re-id performance on four cross-resolution re-id solely using bicubic interpolation, and so does the ESRGAN.
benchmarks, the bold font denotes the best result.
Compared to FSRCNN, ESRGAN + bicubic only provides
minor improvements, which is around 0.4% to 2.5%. By con-
trast, our SRTSR achieve the best result, and almost remains
the same level on VR-Market1501 and VR-Market1501-Hard
datasets, which could indicate ours can achieve ideal perfor-
mance on datasets of which the resolutions are in non-integral
proportion.
input image to arbitrary size, our re-id network can achieve V. CONCLUSION
better performance on cross-resolution person re-id dataset We propose a supplementary super resolution network with
compared to other SR network embedded re-id network. a novel activation function, SA-Swish, to upscale the pedes-
Also, it is noticeable that most the SR solutions performs well trian images to the same resolution directly.The proposed
in MLR-CUHK-03 dataset. Especially, our SRTSR outper- light-weight feature learning module is time efficient and
forms the baseline by 15.6%/13.6%, which may indicates accurate, and for each pair of upscale factors, the proposed
that the SR methods can achieve ideal result on png format upscale module generates a weight matrix and a new corre-
images rather than jpg format images. spondence between LR and SR images. the upscale module
Similarly, Table 5 shows the same situation, the SR meth- can generate the SR pedestrian images of arbitrary size.
ods performs better in MLR-CUHK-03 dataset than the other The extensive experiments show that the SA-Swish
two jpg format datasets. In this part, we use our SR network to achieves better performance than Swish, and our SR mod-
generate a clearer and HR dataset from the original one, and ule performs the best especially on datasets with varies of
substitute bicubic interpolation as a new normalization tool. resolutions. And the experiments on datasets of different
And Table 5 proves our SR sub-network is capable to increase formats may indicates that the SR methods could show better
the accuracy of other state-of-the-art person re-id methods on performance on png format images than jpg format images.
cross-resolution datasets. Similarly, with the help of our SR network, the person re-id
Table 5 also shows the result of person re-id precision on network achieves the best Rank-1/Rank-5 results among the
self-created dataset, VR-Market1501-Hard. The increasing other SR method-based are-id networks.
number of varying resolutions does not bring much diffi-
culty for bicubic interpolation, as the precision only drops REFERENCES
2.4%, which could attribute to the difference between the [1] J. Jiao, W. Zheng, A. Wu, X. Zhu, and S. Gong, ‘‘Deep low-resolution
person re-identification,’’ in Proc. AAAI, 2018, pp. 6967–6974.
upscale factors for width and height. One noticeable thing [2] C. Zhang, L. Zhu, S. Zhang, and W. Yu, ‘‘PAC-GAN: An
is that the precision of FSRCNN + bicubic interpolation effective pose augmentation scheme for unsupervised cross-view
drops significantly, and in some cases, it is even lower than person re-identification,’’ Neurocomputing, vol. 387, pp. 22–39,
Apr. 2020.
[3] X. Sun and L. Zheng, ‘‘Dissecting person re-identification from the view-
point of viewpoint,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recog-
TABLE 5. Person re-id performance of four different re-id methods on nit. (CVPR), Jun. 2019, pp. 608–617.
three cross-resolution benchmarks (Normalized by Bicubic interpolation), [4] X.-Y. Jing, X. Zhu, F. Wu, R. Hu, X. You, Y. Wang, H. Feng, and J.-Y. Yang,
FSRCNN, ESRGAN, and SRTSR, respectively. The bold font denotes the best ‘‘Super-resolution person re-identification with semi-coupled low-rank
result. discriminant dictionary learning,’’ IEEE Trans. Image Process., vol. 26,
no. 3, pp. 1363–1378, Mar. 2017.
[5] X. Li, W.-S. Zheng, X. Wang, T. Xiang, and S. Gong, ‘‘Multi-scale learn-
ing for low-resolution person re-identification,’’ in Proc. IEEE Int. Conf.
Comput. Vis. (ICCV), Dec. 2015, pp. 3765–3773.
[6] S. Zhu, X. Gong, Z. Kuang, and J. Du, ‘‘Partial person re-identification
with two-stream network and reconstruction,’’ Neurocomputing, vol. 398,
pp. 453–459, Jul. 2020.
[7] Z. Wang, M. Ye, F. Yang, X. Bai, and S. Satoh, ‘‘Cascaded SR-GAN for
scale-adaptive low resolution person re-identification,’’ in Proc. 27th Int.
Joint Conf. Artif. Intell., Jul. 2018, pp. 3891–3897.
[8] S. Mao, S. Zhang, and M. Yang, ‘‘Resolution-invariant person re-
identification,’’ in Proc. 28th Int. Joint Conf. Artif. Intell., Aug. 2019,
pp. 883–889.
[9] Y. Ha, J. Tian, Q. Miao, Q. Yang, J. Guo, and R. Jiang, ‘‘Part-
based enhanced super resolution network for low-resolution
person re-identification,’’ IEEE Access, vol. 8, pp. 57594–57605,
2020.
[10] D. S. Cheng, M. Cristani, M. Stoppa, L. Bazzani, and V. Murino, ‘‘Custom
pictorial structures for re-identification,’’ in Proc. Brit. Mach. Vis. Conf.,
2011, p. 6.
[11] Y. Liu, Y. Yang, Y. Shu, T. Zhou, J. Luo, and X. Liu, ‘‘Super-resolution
ultrasound imaging by sparse Bayesian learning method,’’ IEEE Access,
vol. 7, pp. 47197–47205, 2019.
[12] T. Xiao, H. Li, W. Ouyang, and X. Wang, ‘‘Learning deep feature rep- [32] L. Zheng, L. Shen, L. Tian, S. Wang, J. Wang, and Q. Tian, ‘‘Scalable
resentations with domain guided dropout for person re-identification,’’ person re-identification: A benchmark,’’ in Proc. IEEE Int. Conf. Comput.
in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, Vis. (ICCV), Dec. 2015, pp. 1116–1124.
pp. 1249–1258. [33] H. Luo, Y. Gu, X. Liao, S. Lai, and W. Jiang, ‘‘Bag of tricks and a
[13] X. Hu, H. Mu, X. Zhang, Z. Wang, T. Tan, and J. Sun, ‘‘Meta-SR: A strong baseline for deep person re-identification,’’ in Proc. IEEE/CVF
magnification-arbitrary network for super-resolution,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2019,
Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 1575–1584. pp. 1487–1495.
[14] P. Ramachandran, B. Zoph, and Q. V. Le, ‘‘Searching for activation [34] T. Chen, S. Ding, J. Xie, Y. Yuan, W. Chen, Y. Yang, Z. Ren, and Z. Wang,
functions,’’ 2017, arXiv:1710.05941. [Online]. Available: http://arxiv.org/ ‘‘ABD-Net: Attentive but diverse person re-identification,’’ in Proc.
abs/1710.05941 IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2019, pp. 8350–8360.
[15] Y.-J. Li, Y.-C. Chen, Y.-Y. Lin, X. Du, and Y.-C.-F. Wang, ‘‘Recover [35] M. Ye, J. Shen, G. Lin, T. Xiang, L. Shao, and S. C. H. Hoi, ‘‘Deep
and identify: A generative dual model for cross-resolution person re- learning for person re-identification: A survey and outlook,’’ 2020,
identification,’’ in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), arXiv:2001.04193. [Online]. Available: http://arxiv.org/abs/2001.04193
Oct. 2019, pp. 8090–8099.
[16] Y. Zhang, Q. Fan, F. Bao, Y. Liu, and C. Zhang, ‘‘Single-image super-
resolution based on rational fractal interpolation,’’ IEEE Trans. Image
Process., vol. 27, no. 8, pp. 3782–3797, Aug. 2018.
[17] T.-A. Song, S. R. Chowdhury, F. Yang, and J. Dutta, ‘‘PET image super-
resolution using generative adversarial networks,’’ Neural Netw., vol. 125,
pp. 83–91, May 2020. XIAOQI WANG received the B.E. degree in com-
[18] C. Dong, C. Loy, K. He, and X. Tang, ‘‘Learning a deep convolutional net-
munications engineering from Xidian University,
work for image super-resolution,’’ in Proc. ECCV, Sep. 2014, pp. 184–199.
Xi’an, China, in 2019, where he is currently
[19] W. Shi, J. Caballero, F. Huszar, J. Totz, A. P. Aitken, R. Bishop,
D. Rueckert, and Z. Wang, ‘‘Real-time single image and video super- pursuing the M.S. degree in communication and
resolution using an efficient sub-pixel convolutional neural network,’’ in information system. His research interests include
Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, deep learning, image processing, and person
pp. 1874–1883. re-identification.
[20] C. Dong, C. Loy, and X. Tang, ‘‘Accelerating the super-resolution convo-
lutional neural network,’’ in Proc. ECCV, Oct. 2016, pp. 391–407.
[21] B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee, ‘‘Enhanced deep residual
networks for single image super-resolution,’’ in Proc. IEEE Conf. Comput.
Vis. Pattern Recognit. Workshops (CVPRW), Jul. 2017, pp. 136–144.
[22] A. Brifman, Y. Romano, and M. Elad, ‘‘Turning a denoiser into a super-
resolver using plug and play priors,’’ in Proc. IEEE Int. Conf. Image XI YANG (Member, IEEE) received the B.Eng.
Process. (ICIP), Sep. 2016, pp. 1404–1408. degree in electronic information engineering and
[23] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu, ‘‘Residual dense net-
the Ph.D. degree in pattern recognition and intel-
work for image super-resolution,’’ in Proc. IEEE/CVF Conf. Comput. Vis.
ligence system from Xidian University, Xi’an,
Pattern Recognit., Jun. 2018, pp. 2472–2481.
[24] X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, Y. Qiao, and C. Chen,
China, in 2010 and 2015, respectively. From
‘‘ESRGAN: Enhanced super-resolution generative adversarial networks,’’ 2013 to 2014, she was a Visiting Ph.D. Student
in Proc. ECCV Workshops, Sep. 2018, pp. 63–79. with the Department of Computer Science, The
[25] Y. Zhang, K. Li, K. Li, L. Wang, B. Zhong, and Y. Fu, ‘‘Image super- University of Texas at San Antonio, San Antonio,
resolution using very deep residual channel attention networks,’’ in Proc. TX, USA. In 2015, she joined the State Key Labo-
ECCV, Sep. 2018, pp. 286–301. ratory of Integrated Services Networks, School of
[26] B. Singh, D. Toshniwal, and S. K. Allur, ‘‘Shunt connection: An intelligent Telecommunications Engineering, Xidian University, where she is currently
skipping of contiguous blocks for optimizing MobileNet-V2,’’ Neural an Associate Professor in communications and information systems. Her
Netw., vol. 118, pp. 192–203, Oct. 2019. current research interests include image/video processing, computer vision,
[27] H. M. Kasem, M. M. Selim, E. M. Mohamed, and A. H. Hussein, ‘‘DRCS- and multimedia information retrieval.
SR: Deep robust compressed sensing for single image super-resolution,’’
IEEE Access, vol. 8, pp. 170618–170634, 2020.
[28] J. Yu, Y. Fan, J. Yang, N. Xu, Z. Wang, X. Wang, and T. Huang,
‘‘Wide activation for efficient and accurate image super-resolution,’’ 2018,
arXiv:1808.08718. [Online]. Available: http://arxiv.org/abs/1808.08718
DONG YANG received the B.Eng. degree in
[29] C. Ledig, L. Theis, F. Huszar, J. Caballero, A. Cunningham, A. Acosta,
A. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi, ‘‘Photo-realistic
electronic information engineering and the Ph.D.
single image super-resolution using a generative adversarial network,’’ degree in signal processing from Xidian Univer-
in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, sity, Xi’an, China, in 2010 and 2015, respectively.
pp. 4681–4690. He is currently with the Xi’an Institute of Space
[30] Z. Hui, X. Gao, Y. Yang, and X. Wang, ‘‘Lightweight image super- Radio Technology. His current research interests
resolution with information multi-distillation network,’’ in Proc. 27th ACM include image processing and spaceborne radar
Int. Conf. Multimedia, Oct. 2019, pp. 2024–2032. systems.
[31] B.-C. Yang, ‘‘Super resolution using dual path connections,’’ in Proc. 27th
ACM Int. Conf. Multimedia, Oct. 2019, pp. 1552–1560.