Nathan Inkawhich
Air Force Research Laboratory
Rome, NY
nathan.inkawhich@us.af.mil
Figure 1. Overview of IR2VI architecture (illustration from [12]).
orization, or color transfer functions (e.g., [22], [16], [28], [3], just to mention a few). These problems are generally tackled through the use of Generative Adversarial Networks (GANs) [9], which allow the transformation of information between domains. Most GAN-based approaches have focused on supervised contexts, where pairs of correctly registered data are provided. The unpaired problem, which is more challenging, can be tackled by a GAN architecture in the unsupervised context under a cyclic structure (CycleGAN) [32]. CycleGAN learns to map images from one domain (the source domain) onto another domain (the target domain) when paired images are unavailable [24]. This functionality makes such models appropriate for image-to-image translation in the context of unsupervised learning. More recently, diffusion models have been proposed that give superior results to state-of-the-art generative models [5]. Unfortunately, the main limitation of these approaches lies in the large amount of resources required for their training.

Moreover, it is worth noting that generative models are susceptible to producing hallucinations, which refer to outputs that deviate from the original source information. Such occurrences can be especially consequential in sensor-translation tasks, where the preservation of information between modalities is crucial. Therefore, the central aim of the MAVIC challenge is to facilitate the development of reliable translation models that are capable of producing explainable, interpretable, and trustworthy outputs.

The process of converting SAR images to EO images is not a trivial task, as it poses several challenges related to non-collocated sensor collections, pixel intensity association, image size, ground sampling distance (GSD), and image noise differences, as noted in [30]. Different approaches could be followed for converting the SAR to the EO modality, and although similarities exist with the methods mentioned above, the particularity of the problem warrants a deep study of the advantages and drawbacks of each of the many possibilities, evaluated with quantitative and qualitative metrics.

The development of models that can translate between sensors of different modalities can enable the utilization of established algorithms. For example, in automatic target recognition (ATR) tasks, models are often trained on one modality and tested on other modalities. By translating SAR images to EO images, traditional EO models can be used on SAR data.

Image translation has been popular for image analysis covering iso-domain modalities (visual-to-visual), between-domain (infrared-to-visual), and across-domain (SAR-to-visual) imagery. The concept of image translation was popularized in 2017 as demonstrated by Isola et al. [9] for photo generation and object semantic segmentation over a variety of experiments using conditional generative adversarial networks (cGANs):

• Semantic labels ↔ photo
• Architectural labels → photo
• Map ↔ aerial photo
• Grayscale → color photo
• Edges → photo
• Sketch → photo
• Day → night
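The cyclic structure described above can be made concrete with a small numerical sketch. This is only an illustration of the cycle-consistency term that CycleGAN optimizes: the two generators below are hypothetical fixed pixel transforms standing in for learned networks, and the function names are assumptions, not part of any published implementation.

```python
import numpy as np

def g_ab(x):
    """Hypothetical stand-in for the A->B generator (a fixed intensity inversion)."""
    return 1.0 - x

def g_ba(y):
    """Hypothetical stand-in for the B->A generator (the inverse transform)."""
    return 1.0 - y

def cycle_consistency_loss(x, forward, backward):
    """L1 cycle loss ||G_BA(G_AB(x)) - x||_1, averaged over pixels.

    CycleGAN adds this term (in both directions) to the adversarial losses
    so that translation preserves image content when paired data are
    unavailable."""
    x_cycled = backward(forward(x))
    return float(np.mean(np.abs(x_cycled - x)))

img = np.random.rand(256, 256)  # toy single-channel "domain A" image
loss = cycle_consistency_loss(img, g_ab, g_ba)
print(loss)  # ~0, since the toy generators are (near-exact) inverses
```

A real model minimizes this loss jointly with the two adversarial losses; here the loss is already near zero only because the toy generators happen to invert each other.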
Figure 2. Example of colorization task (illustration from [11]).
Many researchers extended these types of methods on the same datasets, such as a combination of the conditional variational autoencoder GAN and the conditional latent regressor GAN called BicycleGAN [31]. Additionally, using the same datasets, another popular method is DualGAN, which focuses on training over two sets of unlabeled images from two domains [26]. Building on these ideas, within-domain and cross-domain methods were developed for seasonal changes such as colorization and object variation using the Multimodal Unsupervised Image-to-image Translation (MUNIT) framework [8]. Using only a single-domain model, StarGAN can perform image translation to many different domains [4].

A common use of image-to-image translation supports a method of transfer learning conducting domain adaptation. The goal is to extract domain-agnostic features, then reconstruct a domain-specific image with cycle consistency, and predict labels from the agnostic features to simultaneously learn from the source domain and adapt to the target domain [15]. Additional methods have been applied to medical imagery, such as MedGAN [1]. Review papers highlight the different image translation methods developed in recent years [17].

Among these developments, there was consideration of image-to-image translation for EO and infrared data. Liu et al. [12] utilized the above methods to develop the IR2VI method and compared it to CycleGAN, DualGAN, and StarGAN (Figure 1). Results demonstrate that IR2VI adds semantic visible information and object shape information to the original thermal images, while CycleGAN and UNIT were not able to perform cross-modality image translation. The StarGAN-translated images lack texture information; however, the blurred shape information did support the VI object detector.

Following up with image colorization using the No-reference Image Quality Evaluation metric, a texture net was shown to be superior [11] (Figure 2), and WGGAN, a Wavelet-Guided Generative Adversarial Network for Thermal Image Translation [27] (Figure 3), further improved the analysis. Key to these approaches were the features used in the GAN to support multimodal image translation.

2. Challenge

The 2023 MAVIC challenge is held jointly with the Perception Beyond the Visible Spectrum (PBVS) workshop and is a complement to the MAVOC challenge. The MAVIC challenge is designed to facilitate innovative approaches in multi-modal sensor translation. Participants are evaluated using a weighted average of the L2, LPIPS [29], and FID [7] scores. The challenge centers on the advancement of multi-modality translation networks. Participating teams are provided with a collection of image pairs, consisting of SAR and EO modalities, and are tasked with performing image translation from one modality to the other. Upon completion, the teams' generated outputs are evaluated on the metrics described in Section 2.2.
Figure 4. Example of failed translation from SAR to EO (order = SAR input, translation, ground truth). We draw attention to the aircraft in the image. The translated image illustrates an example of the generative network hallucinating.

SAR  68,151  80  3,586
EO   68,151  80  3,586

Figure 5. Examples of (SAR, EO) image pairs pulled from the training set. The EO image is on the left while the SAR image is on the right. These are 256 × 256 pixels. This shows examples of buildings, cars, and trees. These have been aligned and re-sampled to ensure consistent size.

2.2. Evaluation

The submissions are evaluated using an average of three metrics:

1. LPIPS: Learned Perceptual Image Patch Similarity
2. FID: Fréchet Inception Distance
3. L2: Pixel-wise L2 norm

The selection of the three evaluation metrics serves the purpose of promoting the production of high-fidelity translations. Specifically, the L2 metric is incorporated into the evaluation process to mitigate against the occurrence of generative hallucinations. By enforcing pixel-wise L2
norms, the L2 metric effectively preserves the overall structure of the target image. However, the L2 metric has the drawback of prioritizing low-frequency details while eliminating high-frequency ones. The combined utilization of the LPIPS and FID metrics complements the L2 metric by ensuring the presence of high-resolution details in the generated images. Additionally, these metrics function to ensure that the generated images are well aligned with the target image domain, thus contributing to the overall accuracy and effectiveness of the translation process.

The LPIPS metric is constructed based on the VGG-16 architecture [21]. LPIPS is a technique that leverages deep feature representations of two images to compute a similarity metric that closely approximates human perceptual judgments. The FID metric, on the other hand, is established using a pre-trained InceptionV3 [23] network, and its computation employs the feature layer of dimension 64. The FID metric provides a measure of the dissimilarity between two probability distributions. When applied to an activation layer of a neural network, it enables an effective comparison of the feature spaces of two images. The FID is calculated as:

    FID = ‖µ − µ_ω‖² + tr(Σ + Σ_ω − 2(ΣΣ_ω)^(1/2))    (1)

where µ corresponds to the mean, "tr" is the trace, and Σ corresponds to the covariances. The subscript ω indicates a fake or generated image [20].

In order to consolidate the three metrics, a normalization procedure is executed to ensure that each metric is scaled between the range of 0 and 1. The normalization methodology differs among the metrics. In the case of the L2 norm, an image normalization technique is employed to restrict the values of the input images being compared between the range of 0 and 1. For the LPIPS metric, the normalization process necessitates the scaling of the output weights. Finally, the FID metric is normalized via the application of a weighted arctan activation function:

    Final Score = ((2/π) · arctan(FID) + LPIPS + L2) / 3    (2)

2.3. Challenge Phases

The challenge began January 11, 2023, and the test data was released March 1, 2023. The testing phase ended on March 7, 2023, with team submissions finalized.

3. Challenge Results

The challenge results are summarized in this section. This challenge had 52 teams participate, with 5 teams submitting results during the development phase and 8 teams submitting results in the testing phase; see Table 2. Resulting submissions from the top three teams can be seen in Figure 8.

Table 2. Top Performing Teams in Competition

Rank  Team             Total ↓  LPIPS ↓  FID ↓  L2 ↓
1     USTC-IAT-United    0.09     0.25    0.02   0.01
2     pokemon            0.14     0.35    0.04   0.01
3     wangzhiyu918       0.14     0.38    0.02   0.01
4     ngthien            0.18     0.43    0.10   0.01
5     Wizard001          0.26     0.50    0.27   0.02
6     hanhai             0.30     0.30    0.59   0.01
7     u7355608           0.33     0.54    0.43   0.02
8     jsyoon             0.33     0.46    0.53   0.01

4. Methods

This section briefly summarizes the approaches used by the teams that submitted their models and documentation for prize consideration. Not all teams submitted their methods, and those teams are subsequently absent from this paper. We examine the submitted methods from the top teams. This section consists of edited summaries submitted by each team.

4.1. Rank 1: USTC-IAT-United

Team USTC-IAT-United proposed a SAR2EO framework capable of converting SAR images into EO images. To address the characteristics of SAR images, a SAR image pre-processing module is proposed; this module proved effective in experiments. They also compared pix2pix [9], pix2pixHD [25], SPADE [18], UNIT-DDPM [19], and other methods, and finally chose pix2pixHD as their base model through experiments. Figure 6 shows the flow chart of the proposed final solution for the competition.

In the training phase, they first pass the SAR images through a data pre-processing module, then generate the low-precision output with the low-precision generative model, and then discriminate between the training data and the low-precision output through the discriminator. The low-precision image and training data are fed into the high-precision generator together to generate the high-precision output, which is the final result. To make the final result sharper, the output and training data continue to be supervised by the discriminator. Similarly, the test set goes through the same pre-processing module and through the two low-precision and high-precision generators to get the final EO results. Their proposed SAR2EO framework achieved strong results and won first place in the competition.

On the whole, their proposed method is made of four optimized parts to generate high-quality, high-definition large images:

1. The generator is upgraded from U-Net to a multi-level generator (coarse-to-fine generator)
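Equations (1) and (2) can be sketched numerically. This is a minimal illustration, not the challenge's official scoring code: the FID is shown only for the simplified case of diagonal covariances (where the matrix square root becomes an element-wise square root), and all helper names are assumptions.

```python
import numpy as np

def l2_metric(img_a, img_b):
    """Pixel-wise L2 term: images are normalized to [0, 1] before the
    mean squared difference is taken."""
    a = np.asarray(img_a, dtype=np.float64) / 255.0
    b = np.asarray(img_b, dtype=np.float64) / 255.0
    return float(np.mean((a - b) ** 2))

def fid_diagonal(mu, var, mu_w, var_w):
    """Eq. (1), simplified to diagonal covariances given as variance
    vectors, so (Sigma Sigma_w)^(1/2) is an element-wise sqrt.
    The subscript w marks the generated-image statistics."""
    mean_term = np.sum((mu - mu_w) ** 2)
    cov_term = np.sum(var + var_w - 2.0 * np.sqrt(var * var_w))
    return float(mean_term + cov_term)

def final_score(fid, lpips, l2):
    """Eq. (2): the unbounded FID is squashed toward [0, 1) by a
    weighted arctan, then averaged with LPIPS and L2 (lower is better)."""
    return ((2.0 / np.pi) * np.arctan(fid) + lpips + l2) / 3.0

# Identical feature statistics give FID = 0, and all-zero metrics give
# a perfect final score of 0.
mu, var = np.array([0.5, 1.0]), np.array([1.0, 0.25])
print(final_score(fid_diagonal(mu, var, mu, var), 0.0, 0.0))  # 0.0
```

The arctan term is what keeps an arbitrarily large FID from dominating the average: it maps [0, ∞) onto [0, 1), matching the scale of the already-normalized LPIPS and L2 terms.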
Figure 6. Overview of team USTC-IAT-United's model. It utilizes a four-stage process to produce high-quality translated images.
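The coarse-to-fine inference path described above can be outlined structurally. Everything below is an illustrative sketch with numpy stand-ins (averaging and nearest-neighbor upsampling in place of the learned pix2pixHD generators), not the team's actual implementation, and every function name is an assumption.

```python
import numpy as np

def preprocess(sar):
    """Stand-in for the SAR pre-processing module (here: min-max scaling)."""
    lo, hi = sar.min(), sar.max()
    return (sar - lo) / (hi - lo + 1e-8)

def low_res_generator(x):
    """Stand-in G1: produce a coarse, half-resolution output."""
    return 0.5 * (x[::2, ::2] + x[1::2, 1::2])

def high_res_generator(x, coarse):
    """Stand-in G2: refine the upsampled coarse output using the input."""
    up = np.kron(coarse, np.ones((2, 2)))  # nearest-neighbor upsample
    return 0.5 * (up + x)

def translate(sar):
    """Inference path of the sketched pipeline:
    pre-process -> coarse generator -> refinement generator.
    (The discriminators appear only during training, so they are omitted.)"""
    x = preprocess(sar)
    coarse = low_res_generator(x)
    return high_res_generator(x, coarse)

sar = np.random.rand(256, 256)
eo_hat = translate(sar)
print(eo_hat.shape)  # (256, 256)
```

The point of the sketch is the data flow: the coarse output is produced first, then fed back alongside the pre-processed input so the second stage only has to add fine detail.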
Figure 8. Comparison of model inputs, ground truth, and outputs from the top-three teams (columns: SAR input, ground truth, USTC, pokemon, wangzhiyu).
Acknowledgements

We would like to thank Bob Lee for providing logistic support for the 2023 PBVS MAVOC Challenge. We would also like to thank Angel Wheelwright for helping run the competition.

References

[1] Karim Armanious, Chenming Jiang, Marc Fischer, Thomas Küstner, Tobias Hepp, Konstantin Nikolaou, Sergios Gatidis, and Bin Yang. MedGAN: Medical image translation using GANs. Computerized Medical Imaging and Graphics, 79, Jan. 2020.
[2] Erik Blasch, Eloi Bosse, and Dale A. Lambert. High-level Information Fusion Management and System Design. Artech House, 2012.
[3] Zezhou Cheng, Qingxiong Yang, and Bin Sheng. Deep colorization. In Proceedings of the IEEE International Conference on Computer Vision, pages 415–423, 2015.
[4] Yunjey Choi, Min-Je Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. CoRR, abs/1711.09020, 2017.
[5] Prafulla Dhariwal and Alexander Nichol. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 34:8780–8794, 2021.
[6] ETHW. Robert Watson-Watt, Feb. 2016.
[7] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Günter Klambauer, and Sepp Hochreiter. GANs trained by a two time-scale update rule converge to a Nash equilibrium. CoRR, abs/1706.08500, 2017.
[8] Xun Huang, Ming-Yu Liu, Serge J. Belongie, and Jan Kautz. Multimodal unsupervised image-to-image translation. CoRR, abs/1804.04732, 2018.
[9] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1125–1134, 2017.
[10] Martin Liggins II, David Hall, and James Llinas. Handbook of Multisensor Data Fusion: Theory and Practice. CRC Press, 2007.
[11] Shuo Liu, Mingliang Gao, Vijay John, Zheng Liu, and Erik Blasch. Deep learning thermal image translation for night vision perception. ACM Transactions on Intelligent Systems and Technology, 12:1–18, Dec. 2020.
[12] Shuo Liu, Vijay John, Erik Blasch, Zheng Liu, and Ying Huang. IR2VI: Enhanced night environmental perception by unsupervised thermal image translation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 1234–12347, June 2018.
[13] Spencer Low, Oliver Nina, Angel D. Sappa, Erik Blasch, and Nathan Inkawhich. Multi-modal aerial view object classification challenge results - PBVS 2022. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 349–357, 2022.
[14] Uttam K. Majumder, Erik P. Blasch, and David A. Garren. Deep learning for radar and communications automatic target recognition. Microwave Journal, 63(10):118, Oct. 2020.
[15] Zak Murez, Soheil Kolouri, David J. Kriegman, Ravi Ramamoorthi, and Kyungnam Kim. Image to image translation for domain adaptation. CoRR, abs/1712.00479, 2017.
[16] Miguel Oliveira, Angel Domingo Sappa, and Vitor Santos. A probabilistic approach for color correction in image mosaicking applications. IEEE Transactions on Image Processing, 24(2):508–523, 2014.
[17] Yingxue Pang, Jianxin Lin, Tao Qin, and Zhibo Chen. Image-to-image translation: Methods and applications. IEEE Transactions on Multimedia, 24:3859–3881, 2022.
[18] Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. Semantic image synthesis with spatially-adaptive normalization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
[19] Hiroshi Sasaki, Chris G. Willcocks, and Toby P. Breckon. UNIT-DDPM: Unpaired image translation with denoising diffusion probabilistic models. CoRR, abs/2104.05358, 2021.
[20] Maximilian Seitzer. pytorch-fid: FID Score for PyTorch. https://github.com/mseitzer/pytorch-fid, August 2020. Version 0.3.0.
[21] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations, 2015.
[22] Patricia L. Suárez, Angel D. Sappa, Boris X. Vintimilla, and Riad I. Hammoud. Near infrared imagery colorization. In 2018 25th IEEE International Conference on Image Processing (ICIP), pages 2237–2241. IEEE, 2018.
[23] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew Wojna. Rethinking the Inception architecture for computer vision. CoRR, abs/1512.00567, 2015.
[24] Harrish Thasarathan and Mehran Ebrahimi. Artist-guided semiautomatic animation colorization. CoRR, abs/2006.13717, 2020.
[25] Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, and Bryan Catanzaro. High-resolution image synthesis and semantic manipulation with conditional GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[26] Zili Yi, Hao Zhang, Ping Tan, and Minglun Gong. DualGAN: Unsupervised dual learning for image-to-image translation. CoRR, abs/1704.02510, 2017.
[27] Ran Zhang, Junchi Bin, Zheng Liu, and Erik Blasch. WGGAN: A wavelet-guided generative adversarial network for thermal image translation. In Generative Adversarial Networks for Image-to-Image Translation, pages 313–327. Elsevier, 2021.
[28] Richard Zhang, Phillip Isola, and Alexei A. Efros. Colorful image colorization. In Computer Vision – ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part III, pages 649–666. Springer, 2016.
[29] Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In CVPR, 2018.
[30] Y. Zheng, E. Blasch, and Z. Liu. Multispectral Image Fusion and Colorization. Press Monographs. SPIE Press, 2018.
[31] Jun-Yan Zhu, Richard Zhang, Deepak Pathak, Trevor Darrell, Alexei A. Efros, Oliver Wang, and Eli Shechtman. Toward multimodal image-to-image translation. CoRR, abs/1711.11586, 2017.
[32] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2223–2232, 2017.
USTC-IAT-United:
Members: Jun Yu, Shenshen Du, Renjie Lu, Pengwei Li, Guochen Xie, Zhongpeng Cai, Keda Lu, Qing Ling, Cong Wang, Luyu Qiu, Wei Zheng
Affiliation: University of Science and Technology of China, Huawei Technologies

wangzhiyu918:
Members: Zhiyu Wang, Xudong Kang, Shutao Li
Affiliation: College of Electrical and Information Engineering, School of Robotics, Hunan University