The 3 × 3 mm2 scan has a higher image resolution and can depict the retinal foveal avascular zone (FAZ) and surrounding capillaries more clearly, as shown in Fig. 1 (B) and (D). By contrast, as shown in Fig. 1 (C) and (E), the 6 × 6 mm2 scan has a larger FOV but a lower scanning density, which results in lower imaging visibility for small capillaries [7]. This is because a high scanning/sampling frequency can hardly be achieved in clinical practice, given the longer inter-frame and total imaging time at a larger FOV. Artifacts caused by involuntary movements, blinking, and tear film evaporation arise during the long acquisition procedure and interfere with clinical diagnosis [8]. Thus, it has become a challenging issue to ensure a considerable image resolution with a larger FOV.
In recent years, several works have attempted to achieve high imaging resolution within larger-FOV images by improving the optical system directly [9]. Although significant efforts have been invested, current state-of-the-art OCTA acquisition systems still need to trade off image resolution against FOV size [10]. Hence, pre-processing techniques such as image enhancement have been developed to compensate for the resolution loss, and have shown considerable value. For example, Zang et al. [11] proposed to minimize image artifacts in order to extend the total sampling and thus obtain high-resolution images with a larger FOV; Uji et al. [12] employed a multiple en face image averaging method to enhance the image quality of OCTA; Tan et al. [13] developed a vessel enhancement algorithm based on a modified Bayesian residual transform, which can improve the contrast and visibility of vascular features. However, such procedures may require more repeated acquisitions, which in turn increases the total sampling time.
Image super-resolution reconstruction algorithms have been widely applied to natural and medical images [14]–[20], as they are not only able to perform HR reconstruction of images with larger FOVs, but also reduce the need for long acquisition times. Currently, image super-resolution techniques can be divided into three types: interpolation-based [14], statistics-based [15]–[17], and learning-based methods [21]–[23]. Most image super-resolution techniques are limited to small upscaling factors or fail to recover detailed, realistic textures, while learning-based methods are capable of recovering missing high-frequency information from large quantities of LR-HR pairs, and have thus achieved significant improvements in super-resolution reconstruction [18]–[20], [24], [25]. For example, since the introduction of the Super-Resolution Convolutional Neural Network (SRCNN) [18], convolutional neural networks (CNNs) have led to state-of-the-art performance in super-resolution reconstruction of medical images. Umehara et al. [19] applied SRCNN to CT images, and their approach outperforms many traditional linear interpolation methods. For MRI, Shi et al. [20] developed a residual learning-based super-resolution method to solve MRI super-resolution problems. Feng et al. [25] proposed a multi-stage integration network for multi-contrast MRI super-resolution, which explicitly models the dependencies between multi-contrast images at different stages. Zhou et al. [26] proposed a two-stage GAN to improve the reconstruction of ultrasound image structures and details by cascading a U-Net network at the front end of the generator and a modified GAN network at the back end. The network combined multi-scale global residual and local residual learning, which can effectively capture high-frequency details. These successes have motivated our investigation of OCTA super-resolution methods using CNNs. Hence, in this paper, we aim to enhance the image quality of 6 × 6 mm2 OCTA via a learning-based super-resolution method to alleviate the resolution loss caused by under-sampling.
In general, learning-based super-resolution methods require pairs of LR-HR images. In practice, it is extremely difficult to obtain a realistic HR image in 6 × 6 mm2 OCTA acquisition. Since 3 × 3 mm2 OCTA images have higher resolution and are also fully contained within the 6 × 6 mm2 images, we can therefore use realistic 3 × 3 mm2 OCTA images as the reference HR images to guide the training of realistic 6 × 6 mm2 image reconstruction. This strategy requires the registration of 3 × 3 mm2 to 6 × 6 mm2 images. However, it is difficult to achieve a perfect registration result, because local capillary differences caused by eye motion and signal variations obviously exist between the two acquisitions, as shown in Fig. 1. Such discrepancy in the spatial mapping between 3 × 3 mm2 and 6 × 6 mm2 images has been observed in several OCTA repeatability and reproducibility studies [27]–[30]. Ignoring this domain discrepancy during the learning process could result in over-smoothed reconstructions with less detailed information. Reducing the domain gap between 3 × 3 mm2 and 6 × 6 mm2 images is therefore a key aspect of high-resolution vascular structure reconstruction.
Domain adaptation aims to use labeled source domains to learn models that perform well on unlabeled target domains. Recently, adversarial-based domain adaptation methods have been proposed for solving challenging dense prediction tasks [31]–[33]. For example, Tsai et al. [33] proposed an adversarial-based domain adaptation method for semantic segmentation, which uses the spatial similarity of the source and target domain outputs to reduce the domain differences. By means of appropriate adaptation strategies, models trained on synthetic datasets have achieved performance comparable to that of models trained on realistic labeled datasets [34]. Inspired by these adversarial-based domain adaptation methods, we propose a novel framework from a domain adaptation perspective for 6 × 6 mm2 OCTA image super-resolution reconstruction, which can mitigate the difficulties in pairwise training of 3 × 3 mm2 and 6 × 6 mm2 images. In the proposed network, a multi-level super-resolution model is proposed as a generator for 6 × 6 mm2 image reconstruction, and PatchGAN [35] is developed as a discriminator to align the reconstruction results with the features of HR images. In addition, we introduce a sparse edge-aware loss to optimize the vascular structure of the reconstruction results via dynamic alignment of edges between the reconstructed HR image and the reference HR image. Finally, we perform quality evaluation of the super-resolution results on two OCTA sets with vascular and FAZ pixel annotations. The experimental results show that our proposed method achieves state-of-the-art results in terms of visual quality and clinical evaluation on both OCTA sets.
Fig. 3. Architecture of the proposed SASR network for OCTA image reconstruction. The proposed SASR network mainly consists of a multi-level super-resolution model (MLSR), patch discrimination, and a sparse edge-aware loss ($L_{se}$). The MLSR is used as the generation model in the SASR, including (a) a backbone network, (b) an encoder-decoder network, and (c) an up-sampling fusion network.
$$\sum_{k=1}^{K} A_k(x) = 1, \qquad 0 \leq A_k(x) \leq 1 \quad (3)$$

where $W_k$ and $A_k$ represent the weight matrix and attention map of the $k$-th kernel, and we set $K = 4$ in the experiments. Note that $A_k$ is obtained by a Squeeze-and-Excitation (SE) module [39] to obtain a discriminative representation. More specifically, the SE module is composed of a global average pooling (GAP) layer and a Multi-Layer Perceptron (MLP) with a ReLU-activated hidden layer, followed by a Softmax layer. The output of $W(x)$ is denoted as $y$, which is fed into the PixelShuffle [40] layer to obtain a high-resolution reconstruction of the same size as $I_{HR}$.
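To make the kernel-attention mechanism of Equation (3) concrete, the following is a minimal PyTorch sketch of an attention-weighted dynamic convolution in the spirit of [38]. It is an illustration under stated assumptions, not the authors' exact implementation: the module name, channel sizes, and reduction ratio are made up here, and the attention weights are per-sample scalars (the simplest reading of Equation (3)); only the GAP → MLP → Softmax structure and the constraint $\sum_k A_k(x) = 1$ follow the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    """Attention-weighted dynamic convolution (illustrative sketch).

    K parallel kernels W_k are mixed by attention weights A_k(x) produced by
    an SE-style branch (GAP -> MLP with ReLU hidden layer -> Softmax), so that
    sum_k A_k(x) = 1 as in Equation (3). K = 4 follows the paper's setting.
    """
    def __init__(self, in_ch, out_ch, kernel_size=3, K=4, reduction=4):
        super().__init__()
        self.K = K
        # K candidate kernels W_k, stored as one tensor.
        self.weight = nn.Parameter(
            torch.randn(K, out_ch, in_ch, kernel_size, kernel_size) * 0.02)
        # SE-style attention: GAP -> MLP -> Softmax over the K kernels.
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_ch, in_ch // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(in_ch // reduction, K),
            nn.Softmax(dim=1),  # enforces sum_k A_k(x) = 1, 0 <= A_k(x) <= 1
        )
        self.padding = kernel_size // 2

    def forward(self, x):
        b = x.size(0)
        a = self.attention(x)                        # (B, K) weights A_k(x)
        # Mix the K kernels per sample: W(x) = sum_k A_k(x) * W_k.
        w = torch.einsum('bk,koihw->boihw', a, self.weight)
        # Grouped-convolution trick to apply a different kernel per sample.
        x = x.reshape(1, -1, x.size(2), x.size(3))
        w = w.reshape(-1, w.size(2), w.size(3), w.size(4))
        y = F.conv2d(x, w, padding=self.padding, groups=b)
        return y.reshape(b, -1, y.size(2), y.size(3))
```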
The PixelShuffle layer first feeds the low-resolution feature
map into a convolutional layer for channel expansion, and then performs multi-channel reorganisation through periodic shuffling to generate a high-resolution map. Compared with other upsampling methods, PixelShuffle better preserves high-frequency information and thus enriches image detail and texture. In the training stage, we combine the Mean Squared Error (MSE) and SSIM as the super-resolution loss $L_{sr}$ to optimize the output of this network. The loss is defined as:

$$L_{sr} = \lambda_{MSE} \cdot L_{MSE} + \lambda_{SSIM} \cdot L_{SSIM} \quad (4)$$

where $\lambda_{MSE}$ and $\lambda_{SSIM}$ are set to 1 and 0.5 empirically.
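As a reference for Equation (4), here is a hedged PyTorch sketch of the combined loss. The windowless, global-statistics SSIM below is a simplification kept for self-containment (practical implementations use local Gaussian windows), and the `1 - SSIM` form of the SSIM term is an assumption, since the excerpt does not spell out the exact SSIM-loss formulation.

```python
import torch
import torch.nn.functional as F

def ssim_global(x, y, c1=0.01**2, c2=0.03**2):
    """Simplified SSIM over whole tensors with intensities in [0, 1].
    Global statistics only; a windowed SSIM would match common practice."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def sr_loss(sr, hr, lambda_mse=1.0, lambda_ssim=0.5):
    """L_sr of Equation (4): weighted MSE plus an SSIM term, with the
    paper's weights 1 and 0.5. SSIM is a similarity, so 1 - SSIM is
    minimized here (an assumption about the loss form)."""
    l_mse = F.mse_loss(sr, hr)
    l_ssim = 1.0 - ssim_global(sr, hr)
    return lambda_mse * l_mse + lambda_ssim * l_ssim
```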
B. Patch discrimination

To guide the realistic LR-to-HR image reconstruction, we use a discriminator to assist the generative network in achieving spatial feature alignment between synthetic and realistic data. In our paper, we use PatchGAN to classify true and false samples for the synthetic and real LR image reconstruction results. Compared with traditional discriminators, PatchGAN makes judgments at the patch level rather than the whole-image level. As shown in Fig. 5, the structure of PatchGAN is composed of five 4 × 4 convolutional layers, with a stride of 2 in the first three layers and a stride of 1 in the last two layers. Except for the last convolutional layer, the first four convolutional layers are each followed by a Leaky ReLU layer with a slope of 0.2 and a batch normalization (BN) layer. A Sigmoid layer is used to obtain the probability score of the output feature map of the last convolutional layer. In this setting, the receptive field size of PatchGAN is 70 × 70, which means that PatchGAN has a faster inference speed than traditional discriminators while still guiding the generator to produce realistic results. Consequently, each pixel in the final output feature map represents the probability that the corresponding 70 × 70 patch of the input image comes from a real sample. To this end, the loss function in the adversarial process of the discriminator and generator is defined as:

$$L_{adv} = \mathbb{E}[\log(1 - D(G(I_{LR})))] + \mathbb{E}[\log D(G(D_{LR}))] \quad (5)$$

where $D$ and $G$ denote the discriminator and the MLSR generator, respectively.
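The discriminator described above can be sketched in PyTorch as follows. The layer pattern (five 4 × 4 convolutions, strides 2-2-2-1-1, BN + LeakyReLU(0.2) after the first four, Sigmoid at the end) follows the text and yields the stated 70 × 70 receptive field; the channel widths (64 to 512) and the single-channel grayscale input are assumptions borrowed from the common PatchGAN configuration.

```python
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """70x70 PatchGAN sketch: five 4x4 convolutions, stride 2 for the first
    three and stride 1 for the last two. Channel widths follow the usual
    PatchGAN setup and are an assumption here."""
    def __init__(self, in_ch=1):
        super().__init__()
        layers, ch = [], [in_ch, 64, 128, 256, 512]
        for i in range(4):
            stride = 2 if i < 3 else 1
            layers += [nn.Conv2d(ch[i], ch[i + 1], 4, stride, 1),
                       nn.BatchNorm2d(ch[i + 1]),
                       nn.LeakyReLU(0.2, inplace=True)]
        # Final 4x4 convolution maps to a 1-channel score map; each output
        # pixel rates one 70x70 input patch as real or fake.
        layers += [nn.Conv2d(ch[-1], 1, 4, 1, 1), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)
```

A binary cross-entropy over the resulting score map then realizes the patch-level adversarial objective of Equation (5).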
Fig. 6. Demonstration of the sparse result between $D_{LR}$ and $I_{LR}$. (a) Synthetic LR image ($D_{LR}$). (b) Realistic LR image ($I_{LR}$). (c)-(d) Edge structures of (a) and (b) extracted by the Canny operator. (e) Fusion result of (c) and (d). (f) Sparse results determined by the similarity of each pair of patches in (e).

C. Sparse edge-aware loss

Unlike the general domain adaptation approach, the inputs to our proposed framework are paired synthetic and realistic LR images. Even though $I_{LR}$ and $I_{HR}$ have some discrepancy in spatial mapping, their vascular structures are very similar. In order to further improve the local vascular reconstruction of realistic LR images, we propose to use an edge similarity loss for structure optimization. However, since background noise affects the extraction of some vessel structures, optimizing the overall vessel edges would introduce mislabeling and thus generate over-smoothed structures. To solve this issue, a sparse edge-aware loss is designed to adaptively constrain the reconstruction results. Firstly, as shown in Fig. 6, we use the Canny operator [41] to extract the edge structures of $I_{LR}$ and $D_{LR}$. Then MSE is employed to compute the distances between the two edge images as follows:

$$d(t) = \left\| C(D_t) - C(I_t) \right\|_2 \quad (6)$$

where $D_t$ and $I_t$ denote the $t$-th patches of size $n \times n$ from $D_{LR}$ and $I_{LR}$, with $t = 1, 2, \cdots, T$, and $d(t)$ denotes the distance of each pair of patches. Then we adopt a hard shrinkage operation to promote sparsity:

$$\hat{w}_t = \frac{\max(d_t - \lambda, 0)}{|d_t - \lambda| + \epsilon} \quad (7)$$

where $\max(\cdot, 0)$ is also known as the ReLU activation, and $\epsilon$ is a very small positive scalar. $\lambda$ is used to screen vascular regions of $I_{LR}$ with structures similar to those of $D_{LR}$. Finally, we compute the edge distances between the realistic LR image reconstruction result $\tilde{I}_{HR}$ and the label $I_{HR}$:
$$\hat{d}(t) = \left\| C(H_t) - C(\tilde{H}_t) \right\|_2 \quad (8)$$

where $H_t$ and $\tilde{H}_t$ denote the $t$-th patches of size $n \times n$ from $I_{HR}$ and $\tilde{I}_{HR}$. Therefore, the sparse edge-aware loss is defined as:

$$L_{se} = \sum_{t=1}^{T} \hat{d}(t)\, \hat{w}_t \quad (9)$$

To this end, the total loss function is denoted as:

$$L_{total} = \lambda_{MSE} \cdot L_{MSE} + \lambda_{SSIM} \cdot L_{SSIM} + \lambda_{adv} L_{adv} + \lambda_{se} L_{se} \quad (10)$$

where $\lambda_{adv}$ and $\lambda_{se}$ are set to 1 and 0.1 in our paper.
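The computation of Equations (6)-(9) can be illustrated with the following NumPy/OpenCV sketch. It is evaluation-style pseudocode rather than a drop-in training loss: the Canny operator is non-differentiable, its thresholds here are arbitrary choices, and how the authors back-propagate through the edge terms is not specified in this excerpt.

```python
import cv2
import numpy as np

def patches(img, n=16, T=36):
    """First T non-overlapping n x n patches in row-major order; n = 16 and
    T = 36 follow the paper (6 x 6 patches of 16 x 16 tile a 96 x 96 crop)."""
    h, w = img.shape
    ps = [img[i:i + n, j:j + n]
          for i in range(0, h - n + 1, n)
          for j in range(0, w - n + 1, n)]
    return ps[:T]

def sparse_edge_aware_loss(d_lr, i_lr, i_hr_rec, i_hr, lam=0.05, eps=1e-6):
    """Sketch of Equations (6)-(9) on uint8 grayscale arrays.

    C(.) is the Canny operator, and the hard shrinkage of Eq. (7) turns LR
    edge distances into sparse weights that gate the reconstruction-vs-label
    edge distances of Eq. (8).
    """
    canny = lambda im: cv2.Canny(im, 100, 200).astype(np.float32) / 255.0
    e_d, e_i = canny(d_lr), canny(i_lr)        # edges of synthetic / realistic LR
    e_r, e_h = canny(i_hr_rec), canny(i_hr)    # edges of reconstruction / label
    loss = 0.0
    for pd, pi, pr, ph in zip(patches(e_d), patches(e_i),
                              patches(e_r), patches(e_h)):
        d_t = np.linalg.norm(pd - pi)                        # Eq. (6)
        w_t = max(d_t - lam, 0.0) / (abs(d_t - lam) + eps)   # Eq. (7)
        d_hat = np.linalg.norm(ph - pr)                      # Eq. (8)
        loss += d_hat * w_t                                  # Eq. (9)
    # Eq. (10) would combine this term with L_MSE, L_SSIM, and L_adv using
    # the reported weights 1, 0.5, 1, and 0.1.
    return loss
```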
Fig. 7. Visual comparison of super-resolution images reconstructed by different methods over two randomly selected images from the SURE-O and SURE-Z sets, respectively. Green rectangles indicate two representative patches for illustration.

IV. EXPERIMENTS

A. Implementation details

The proposed method was implemented with the publicly available PyTorch library and trained on an Nvidia GeForce TITAN Xp GPU. In the training phase, we employed the Adam optimizer [42] to optimize the deep model. We used a gradually decreasing learning rate, starting from 0.0001, and a momentum of 0.9. In each iteration, we took a random 96 × 96 patch of the image for training, and the batch size was set to 8. We trained the network for 300 epochs, and the network reached convergence at around the 150th epoch. Training lasted approximately 20 and 8 hours on the SURE-O and SURE-Z datasets, respectively. In addition, online data enhancement with a random rotation from −10° to 10° was employed to enlarge the training set. In the sparse edge-aware loss, the parameter $\lambda$ in Equation (7) was set to 0.05 by counting the average distance between the edges of $D_{LR}$ and $I_{LR}$; $n$ and $T$ were set to 16 and 36, respectively.
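The reported training recipe can be summarized in a short sketch. The model below is a trivial stand-in (the real generator is the MLSR), the step decay is a placeholder for the unspecified "gradually decreasing learning rate", Adam's β1 = 0.9 is taken as the stated momentum, and the augmentation assumes registered LR/HR arrays of equal size.

```python
import random
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from scipy.ndimage import rotate

model = nn.Conv2d(1, 1, 3, padding=1)       # stand-in for the MLSR generator
optimizer = Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
scheduler = StepLR(optimizer, step_size=100, gamma=0.5)  # placeholder decay

def augment_pair(lr_img, hr_img, patch=96, max_deg=10.0):
    """Online augmentation per the text: the same random rotation in
    [-10, 10] degrees and the same random 96 x 96 crop applied to an
    LR/HR pair."""
    deg = random.uniform(-max_deg, max_deg)
    lr = rotate(lr_img, deg, reshape=False, mode='reflect')
    hr = rotate(hr_img, deg, reshape=False, mode='reflect')
    i = random.randint(0, lr.shape[0] - patch)
    j = random.randint(0, lr.shape[1] - patch)
    return (lr[i:i + patch, j:j + patch],
            hr[i:i + patch, j:j + patch])
```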
B. Evaluation Metrics

For the pairs of realistic OCTA images, it is unreliable to evaluate the image quality improvement with the common quality measures, because inherent variations in signal and noise levels arise during image acquisition at different scales, i.e., 3 × 3 mm2 and 6 × 6 mm2. Nevertheless, the similarity of clinically significant structures such as blood vessels provides us with extra possibilities for quality evaluation, i.e., by considering the improvements to FAZ and vascular segmentations.

For the synthetic OCTA images, we use the peak signal-to-noise ratio (PSNR) [50], structural similarity (SSIM) [51], and learned perceptual image patch similarity (LPIPS) [52] to evaluate the reconstruction quality of different super-resolution methods. The PSNR is defined as:

$$\mathrm{PSNR}(f, g) = 10 \log_{10}\!\left( 255^2 / \mathrm{MSE}(f, g) \right) \quad (11)$$

$$\mathrm{MSE}(f, g) = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} (f_{ij} - g_{ij})^2 \quad (12)$$

where $f$ and $g$ denote the reference and reconstructed images, both of size $M \times N$. The SSIM is defined as in [51]:

$$\mathrm{SSIM}(f, g) = \frac{(2\mu_f \mu_g + c_1)(2\sigma_{fg} + c_2)}{(\mu_f^2 + \mu_g^2 + c_1)(\sigma_f^2 + \sigma_g^2 + c_2)} \quad (13)$$

where $\mu$, $\sigma^2$, and $\sigma_{fg}$ denote the local means, variances, and covariance of $f$ and $g$, and $c_1$, $c_2$ are small stabilizing constants.
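Equations (11)-(12) correspond directly to the following NumPy helper, assuming 8-bit images with a peak value of 255 (the `ssim_global` sketch given earlier approximates the SSIM term).

```python
import numpy as np

def psnr(f, g, peak=255.0):
    """PSNR of Equations (11)-(12) for an 8-bit reference f and
    reconstruction g of the same shape."""
    f = f.astype(np.float64)
    g = g.astype(np.float64)
    mse = np.mean((f - g) ** 2)                 # Eq. (12)
    if mse == 0:
        return float('inf')
    return 10.0 * np.log10(peak ** 2 / mse)     # Eq. (11)
```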
TABLE I
FAZ SEGMENTATION PERFORMANCE BETWEEN THE RECONSTRUCTED AND THE REFERENCE HR IMAGES ON THE SURE-O DATASET.
SURE-O
Methods AUC ↑ Sen ↑ G-mean ↑ Kappa ↑ FDR ↓ Dice ↑ dHD ↓
HR 0.979±0.013 0.930±0.038 0.964±0.020 0.944±0.022 0.036±0.025 0.946±0.021 15.58±23.13
Bicubic [43] 0.945±0.068 0.849±0.164 0.914±0.106 0.854±0.207 0.126±0.226 0.859±0.199 29.28±40.31
ScSR [17] 0.947±0.064 0.856±0.147 0.919±0.092 0.866±0.183 0.110±0.203 0.870±0.177 24.51±30.66
SRCNN [18] 0.939±0.079 0.834±0.170 0.905±0.111 0.854±0.205 0.112±0.224 0.859±0.198 20.89±29.33
EDSR [44] 0.944±0.072 0.849±0.160 0.915±0.101 0.866±0.185 0.106±0.198 0.870±0.179 29.26±31.14
DRLN [45] 0.943±0.075 0.851±0.159 0.916±0.101 0.864±0.193 0.111±0.210 0.869±0.186 23.25±30.63
RDN [37] (baseline) 0.942±0.076 0.846±0.166 0.913±0.107 0.863±0.192 0.110±0.205 0.867±0.186 29.12±30.77
ZSSR [46] 0.949±0.060 0.857±0.152 0.919±0.095 0.869±0.183 0.108±0.199 0.873±0.177 25.20±31.48
RealSR [47] 0.958±0.043 0.879±0.119 0.933±0.073 0.868±0.170 0.128±0.195 0.873±0.164 29.83±36.22
CycleGAN [48] 0.942±0.027 0.852±0.085 0.920±0.051 0.849±0.118 0.142±0.139 0.854±0.114 34.22±32.11
Two-stage GAN [26] 0.947±0.066 0.852±0.165 0.916±0.106 0.866±0.193 0.106±0.206 0.870±0.186 24.80±31.30
HAR-Net [49] 0.942±0.076 0.845±0.165 0.912±0.106 0.861±0.195 0.110±0.211 0.866±0.189 23.47±31.42
SASR (Our Method) 0.961±0.034 0.890±0.093 0.940±0.056 0.886±0.147 0.104±0.177 0.890±0.142 17.58±29.58
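For reference, the mask-based columns of these tables can be computed as below. These are the standard definitions (Sen = sensitivity, G-mean = sqrt(Sen·Spe), Cohen's Kappa, FDR = FP/(TP+FP), Dice = 2TP/(2TP+FP+FN)) and are assumed to match the paper's usage; AUC needs probability maps and dHD needs boundary sets, so both are omitted from this sketch.

```python
import numpy as np

def seg_metrics(pred, gt, eps=1e-12):
    """Standard binary-mask segmentation metrics from a confusion matrix."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt);  fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt); tn = np.sum(~pred & ~gt)
    n = tp + fp + fn + tn
    sen = tp / (tp + fn + eps)                  # sensitivity (Sen)
    spe = tn / (tn + fp + eps)                  # specificity
    po = (tp + tn) / n                          # observed agreement
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return {
        'Sen': sen,
        'G-mean': np.sqrt(sen * spe),
        'Kappa': (po - pe) / (1 - pe + eps),    # Cohen's kappa
        'FDR': fp / (tp + fp + eps),
        'Dice': 2 * tp / (2 * tp + fp + fn + eps),
    }
```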
TABLE II
FAZ SEGMENTATION PERFORMANCE BETWEEN THE RECONSTRUCTED AND THE REFERENCE HR IMAGES ON THE SURE-Z DATASET.
SURE-Z
Methods AUC ↑ Sen ↑ G-mean ↑ Kappa ↑ FDR ↓ Dice ↑ dHD ↓
HR 0.978±0.057 0.947±0.120 0.970±0.068 0.946±0.084 0.045±0.041 0.947±0.081 28.27±44.70
Bicubic [43] 0.916±0.051 0.763±0.114 0.867±0.069 0.752±0.144 0.232±0.179 0.760±0.139 105.38±93.92
ScSR [17] 0.932±0.039 0.809±0.092 0.895±0.054 0.810±0.109 0.170±0.134 0.816±0.105 69.01±87.04
SRCNN [18] 0.937±0.028 0.817±0.068 0.901±0.039 0.846±0.072 0.109±0.083 0.851±0.069 74.37±39.51
EDSR [44] 0.940±0.033 0.829±0.075 0.908±0.043 0.848±0.083 0.118±0.093 0.854±0.079 63.59±82.17
DRLN [45] 0.938±0.034 0.823±0.078 0.904±0.045 0.839±0.092 0.129±0.108 0.845±0.089 86.30±86.87
RDN [37] (baseline) 0.941±0.031 0.830±0.073 0.908±0.042 0.840±0.090 0.134±0.110 0.846±0.087 69.92±87.87
ZSSR [46] 0.927±0.042 0.797±0.093 0.888±0.055 0.792±0.119 0.192±0.151 0.799±0.114 82.25±88.34
RealSR [47] 0.932±0.043 0.813±0.097 0.896±0.057 0.789±0.121 0.214±0.153 0.794±0.117 86.30±89.07
CycleGAN [48] 0.942±0.040 0.836±0.083 0.912±0.048 0.878±0.073 0.065±0.059 0.882±0.071 49.82±40.89
Two-stage GAN [26] 0.928±0.041 0.799±0.093 0.889±0.055 0.791±0.116 0.194±0.147 0.799±0.112 63.13±85.29
HAR-Net [49] 0.940±0.035 0.829±0.810 0.907±0.047 0.840±0.095 0.134±0.111 0.845±0.091 85.55±85.32
SASR (Our Method) 0.958±0.033 0.889±0.070 0.940±0.039 0.897±0.063 0.086±0.061 0.900±0.060 41.01±42.88
Fig. 8. FAZ segmentation performance on reconstructed OCTA image by different methods over two sample OCTA images selected from SURE-O
and SURE-Z sets. Red represents under-segmentation, and blue indicates over-segmentation.
TABLE III
VESSEL SEGMENTATION PERFORMANCE BETWEEN THE RECONSTRUCTED AND THE REFERENCE HR IMAGES ON THE SURE-O DATASET.
SURE-O
Methods AUC ↑ Sen ↑ G-mean ↑ Kappa ↑ FDR ↓ Dice ↑
HR 0.891±0.027 0.718±0.067 0.832±0.037 0.726±0.047 0.132±0.048 0.783±0.041
Bicubic [43] 0.788±0.029 0.526±0.054 0.710±0.033 0.553±0.035 0.196±0.075 0.631±0.036
ScSR [17] 0.791±0.033 0.569±0.048 0.735±0.030 0.568±0.044 0.229±0.087 0.651±0.042
SRCNN [18] 0.793±0.033 0.581±0.044 0.741±0.027 0.571±0.046 0.237±0.090 0.656±0.042
EDSR [44] 0.792±0.032 0.575±0.048 0.738±0.030 0.569±0.048 0.234±0.094 0.653±0.044
DRLN [45] 0.794±0.033 0.579±0.046 0.740±0.029 0.572±0.048 0.233±0.091 0.655±0.044
RDN [37] (baseline) 0.793±0.032 0.579±0.047 0.740±0.029 0.571±0.048 0.235±0.093 0.655±0.044
ZSSR [46] 0.790±0.033 0.574±0.046 0.737±0.029 0.569±0.047 0.233±0.093 0.652±0.044
RealSR [47] 0.784±0.030 0.556±0.047 0.725±0.029 0.554±0.040 0.239±0.084 0.639±0.041
CycleGAN [48] 0.791±0.036 0.589±0.048 0.745±0.030 0.575±0.050 0.241±0.093 0.659±0.047
Two-stage GAN [26] 0.789±0.033 0.578±0.047 0.738±0.029 0.567±0.048 0.239±0.096 0.652±0.045
HAR-Net [49] 0.793±0.033 0.576±0.047 0.738±0.029 0.569±0.047 0.234±0.091 0.653±0.042
SASR (Our Method) 0.807±0.036 0.649±0.049 0.775±0.030 0.594±0.060 0.269±0.093 0.684±0.055
TABLE IV
VESSEL SEGMENTATION PERFORMANCE BETWEEN THE RECONSTRUCTED HR IMAGES AND THE REFERENCE HR IMAGES ON THE SURE-Z
DATASET.
SURE-Z
Methods AUC ↑ Sen ↑ G-mean ↑ Kappa ↑ FDR ↓ Dice ↑
HR 0.941±0.015 0.796±0.044 0.877±0.024 0.793±0.035 0.081±0.025 0.852±0.028
Bicubic [43] 0.730±0.040 0.509±0.037 0.675±0.028 0.434±0.050 0.306±0.059 0.586±0.040
ScSR [17] 0.733±0.039 0.525±0.040 0.681±0.027 0.436±0.048 0.319±0.059 0.592±0.041
SRCNN [18] 0.731±0.038 0.531±0.044 0.683±0.028 0.434±0.047 0.326±0.056 0.593±0.041
EDSR [44] 0.733±0.040 0.532±0.041 0.683±0.027 0.433±0.049 0.328±0.060 0.593±0.041
DRLN [45] 0.733±0.040 0.528±0.040 0.682±0.028 0.433±0.049 0.325±0.061 0.591±0.042
RDN [37] (baseline) 0.734±0.040 0.526±0.040 0.681±0.027 0.433±0.048 0.322±0.060 0.591±0.040
ZSSR [46] 0.732±0.039 0.522±0.038 0.680±0.026 0.435±0.047 0.317±0.058 0.590±0.039
RealSR [47] 0.734±0.042 0.515±0.041 0.678±0.028 0.438±0.049 0.306±0.056 0.590±0.041
CycleGAN [48] 0.725±0.043 0.536±0.064 0.682±0.038 0.428±0.056 0.337±0.055 0.591±0.055
Two-stage GAN [26] 0.732±0.037 0.522±0.039 0.680±0.027 0.436±0.048 0.316±0.059 0.591±0.040
HAR-Net [49] 0.734±0.041 0.527±0.040 0.680±0.028 0.432±0.049 0.325±0.062 0.590±0.041
SASR (Our Method) 0.747±0.038 0.541±0.049 0.688±0.030 0.442±0.051 0.323±0.051 0.601±0.045
Fig. 9. Vessel segmentation performance on peri-macular region (green circle) of reconstructed OCTA image by different methods over two sample
OCTA images selected from SURE-O and SURE-Z sets.
The smaller the LPIPS, the closer the generated image is to the ground truth.

C. Performances on realistic low resolution image

In this subsection, we evaluated the reconstruction performance when the realistic low resolution image was utilized as input. To prove the superiority of the proposed method, the following methods were compared:
(1) Traditional methods for natural synthetic images: the Bicubic method [43] and Sparse coding based Super Resolution (ScSR) [17].
(2) Deep learning methods for natural synthetic images:
TABLE V
COMPARISON OF PARAMETER NUMBERS AND TEST TIMES FOR DIFFERENT SUPER-RESOLUTION MODELS ON TWO SETS.
TABLE VI
COMPARISON OF VLD AND VT IN RECONSTRUCTED IMAGES FROM DIFFERENT METHODS ON TWO OCTA SETS.
TABLE VII
ABLATION STUDIES OF FAZ SEGMENTATION ON BOTH THE SURE-O AND SURE-Z SUBSETS
SURE-O SURE-Z
Methods AUC↑ ACC↑ G-mean↑ Kappa↑ FDR↓ Dice↑ dHD ↓ AUC↑ ACC ↑ G-mean↑ Kappa↑ FDR↓ Dice↑ dHD ↓
RDN [37] (baseline) 0.942 0.846 0.913 0.863 0.110 0.867 29.12 0.941 0.830 0.908 0.840 0.134 0.846 69.92
MLSR 0.945 0.854 0.918 0.867 0.108 0.871 31.01 0.943 0.834 0.910 0.845 0.129 0.850 76.97
DASR 0.955 0.880 0.935 0.903 0.066 0.906 23.57 0.953 0.877 0.934 0.893 0.081 0.897 49.95
SASR (Our Method) 0.961 0.890 0.940 0.886 0.104 0.890 17.58 0.958 0.889 0.940 0.897 0.086 0.900 41.01
TABLE VIII
ABLATION STUDIES OF VESSEL SEGMENTATION ON BOTH THE SURE-O AND SURE-Z SUBSETS
SURE-O SURE-Z
Methods AUC↑ ACC↑ G-mean↑ Kappa↑ FDR↓ Dice↑ AUC↑ ACC↑ G-mean↑ Kappa↑ FDR↓ Dice↑
RDN [37] (baseline) 0.793 0.740 0.571 0.740 0.235 0.655 0.734 0.526 0.681 0.433 0.322 0.591
MLSR 0.793 0.616 0.759 0.582 0.256 0.671 0.735 0.529 0.681 0.432 0.326 0.591
DASR 0.800 0.624 0.762 0.585 0.260 0.673 0.740 0.544 0.687 0.433 0.336 0.597
SASR (Our Method) 0.807 0.649 0.775 0.594 0.269 0.684 0.747 0.541 0.688 0.442 0.323 0.601
The quantitative results show that our method achieves the best PSNR on both sets, which indicates that the fidelity of our method outperforms that of the other methods. The LPIPS metric of our method also presents the best performance on both OCTA datasets, indicating that our results are much closer to the HR images in terms of visual characteristics. We adopted a simple method for image degradation, yet the reconstruction quality on the SURE-O dataset is far inferior to that on the SURE-Z dataset, mainly because the image quality of the SURE-Z dataset is relatively higher, with less background noise. On the SURE-Z dataset, all methods achieve comparable reconstruction performance. Nevertheless, our proposed network still achieves the best results in all metrics. Compared with EDSR, the proposed method outperforms it in all metrics on both sets, with a PSNR higher by 0.41 dB and 3.36 dB, respectively; the LPIPS is also improved by 4.3% and 3.4%, respectively. In addition, compared with HAR-Net, our method achieves significant improvements in all three metrics, being 0.39 dB and 3.85 dB higher in PSNR, 0.3% and 2.4% higher in SSIM, and 4.5% and 3.0% higher in LPIPS, respectively.

Tables VII and VIII show that DASR outperforms MLSR in all metrics for the FAZ and vessel segmentation, respectively. This proves that DASR can enhance the clarity and contrast of the vessels in the reconstructed images, which leads to improved FAZ and vessel segmentation results.

3) Sparse edge-aware loss (SE-loss): Based on the proposed DASR, we also introduced the SE-loss to further improve the reconstruction results. The SE-loss is mainly designed to optimize the reconstruction results by exploiting the highly similar characteristics of blood vessels in LR and HR images. Fig. 9 shows that the proposed SE-loss can alleviate the problem of vascular discontinuity in the OCTA reconstruction results while also enhancing image quality. The quantitative vessel segmentation results in Table VIII also show that the SE-loss is beneficial for precise vessel reconstruction in LR images. In local regions of high similarity between LR and HR images, super-resolution reconstruction is performed by taking advantage of the HR vascular information in a supervised manner, while in regions of low similarity, the reconstruction of LR regions is achieved via unsupervised learning. This is useful for enhancing local visibility with rich capillary details.
[3] L. Mou, Y. Zhao, L. Chen, J. Cheng, Z. Gu, H. Hao et al., "Cs-net: Channel and spatial attention network for curvilinear structure segmentation," in Proc. Int. Conf. Med. Image Comput. Comput. Assist. Intervent. (MICCAI), 2019, pp. 721–730.
[4] D. Cabrera DeBuc, G. M. Somfai, and A. Koller, "Retinal microvascular network alterations: potential biomarkers of cerebrovascular and neural diseases," Am. J. Physiol.-Heart Circul. Physiol., vol. 312, no. 2, pp. H201–H212, 2017.
[5] N. Eladawi, M. Elmogy, O. Helmy, A. Aboelfetouh, A. Riad, H. Sandhu, S. Schaal, and A. El-Baz, "Automatic blood vessels segmentation based on different retinal maps from OCTA scans," Comput. Biol. Med., vol. 89, pp. 150–161, Oct. 2017.
[6] Q. S. You, Y. Guo, J. Wang, X. Wei, A. Camino, P. Zang, C. J. Flaxel, S. T. Bailey, D. Huang, Y. Jia et al., "Detection of clinically unsuspected retinal neovascularization with wide-field optical coherence tomography angiography," Retina, 2020.
[7] Y. Ma, H. Hao, J. Xie, H. Fu, J. Zhang, J. Yang, Z. Wang, J. Liu, Y. Zheng, and Y. Zhao, "ROSE: a retinal OCT-angiography vessel segmentation dataset and new model," IEEE Trans. Med. Imag., vol. 40, no. 3, pp. 928–939, 2020.
[8] X. Wei, T. T. Hormel, Y. Guo, T. S. Hwang, and Y. Jia, "High-resolution wide-field OCT angiography with a self-navigation method to correct microsaccades and blinks," Biomed. Opt. Express, vol. 11, no. 6, pp. 3234–3245, 2020.
[9] W. Wieser, B. R. Biedermann, T. Klein, C. M. Eigenwillig, and R. Huber, "Multi-megahertz OCT: High quality 3d imaging at 20 million a-scans and 4.5 gvoxels per second," Opt. Express, vol. 18, no. 14, pp. 14685–14704, 2010.
[10] X. Wei, T. T. Hormel, Y. Guo, and Y. Jia, "75-degree non-mydriatic single-volume optical coherence tomographic angiography," Biomed. Opt. Express, vol. 10, no. 12, pp. 6286–6295, 2019.
[11] P. Zang, G. Liu, M. Zhang, C. Dongye, J. Wang, A. D. Pechauer, T. S. Hwang, D. J. Wilson, D. Huang, D. Li et al., "Automated motion correction using parallel-strip registration for wide-field en face OCT angiogram," Biomed. Opt. Express, vol. 7, no. 7, pp. 2823–2836, 2016.
[12] A. Uji, S. Balasubramanian, J. Lei, E. Baghdasaryan, M. Al-Sheikh, E. Borrelli, and S. R. Sadda, "Multiple enface image averaging for enhanced optical coherence tomography angiography imaging," Acta Ophthalmol., vol. 96, no. 7, pp. e820–e827, 2018.
[13] B. Tan, A. Wong, and K. Bizheva, "Enhancement of morphological and vascular features in OCT images using a modified bayesian residual transform," Biomed. Opt. Express, vol. 9, no. 5, pp. 2394–2406, 2018.
[14] Z. Wang, D. Liu, J. Yang, W. Han, and T. Huang, "Deep networks for image super-resolution with sparse prior," in Proc. IEEE Int. Conf. Comput. Vision, 2015, pp. 370–378.
[15] J. Sun, Z. Xu, and H.-Y. Shum, "Image super-resolution using gradient profile prior," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2008, pp. 1–8.
[16] C.-Y. Yang and M.-H. Yang, "Fast direct super-resolution by simple functions," in Proc. IEEE Int. Conf. Comput. Vision, 2013, pp. 561–568.
[17] J. Yang, J. Wright, T. S. Huang, and Y. Ma, "Image super-resolution via sparse representation," IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861–2873, 2010.
[18] C. Dong, C. C. Loy, K. He, and X. Tang, "Image super-resolution using deep convolutional networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 2, pp. 295–307, 2015.
[19] K. Umehara, J. Ota, and T. Ishida, "Application of super-resolution convolutional neural network for enhancing image resolution in chest CT," J. Digit. Imag., vol. 31, no. 4, pp. 441–450, 2018.
[20] J. Shi, Q. Liu, C. Wang, Q. Zhang, S. Ying, and H. Xu, "Super-resolution reconstruction of MR image with a novel residual learning network algorithm," Phys. Med. Biol., vol. 63, no. 8, p. 085011, 2018.
[21] B. Niu, W. Wen, W. Ren, X. Zhang, L. Yang, S. Wang, K. Zhang, X. Cao, and H. Shen, "Single image super-resolution via a holistic attention network," in Proc. Eur. Conf. Comput. Vis. Springer, 2020, pp. 191–207.
[22] S. Zhao, J. Cui, Y. Sheng, Y. Dong, X. Liang, E. I. Chang, and Y. Xu, "Large scale image completion via co-modulated generative adversarial networks," arXiv preprint arXiv:2103.10428, 2021.
[23] X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, Y. Qiao, and C. Change Loy, "Esrgan: Enhanced super-resolution generative adversarial networks," in Proc. Eur. Conf. Comput. Vis. Workshops, 2018.
[24] D. Mahapatra, B. Bozorgtabar, and R. Garnavi, "Image super-resolution using progressive generative adversarial networks for medical image analysis," Comput. Med. Imag. Graph., vol. 71, pp. 30–39, 2019.
[25] C.-M. Feng, H. Fu, S. Yuan, and Y. Xu, "Multi-contrast mri super-resolution via a multi-stage integration network," in Proc. Int. Conf. Med. Image Comput. Comput. Assist. Intervent. (MICCAI). Springer, 2021, pp. 140–149.
[26] Z. Zhou, Y. Wang, Y. Guo, Y. Qi, and J. Yu, "Image quality improvement of hand-held ultrasound devices with a two-stage generative adversarial network," IEEE Trans. Biomed. Eng., vol. 67, no. 1, pp. 298–311, 2019.
[27] J. Hong, B. Tan, N. D. Quang, P. Gupta, E. Lin, D. Wong, M. Ang, E. Lamoureux, L. Schmetterer, and J. Chua, "Intra-session repeatability of quantitative metrics using widefield optical coherence tomography angiography (OCTA) in elderly subjects," Acta Ophthalmol., vol. 98, no. 5, pp. e570–e578, 2020.
[28] J. Lei, M. K. Durbin, Y. Shi, A. Uji, S. Balasubramanian, E. Baghdasaryan, M. Al-Sheikh, and S. R. Sadda, "Repeatability and reproducibility of superficial macular retinal vessel density measurements using optical coherence tomography angiography en face images," JAMA Ophthalmol., vol. 135, no. 10, pp. 1092–1098, 2017.
[29] S. Men, C.-L. Chen, W. Wei, T.-Y. Lai, S. Song, and R. K. Wang, "Repeatability of vessel density measurement in human skin by OCT-based microangiography," Skin Res. Technol., vol. 23, no. 4, pp. 607–612, 2017.
[30] M.-W. Lee, K.-M. Kim, H.-B. Lim, Y.-J. Jo, and J.-Y. Kim, "Repeatability of vessel density measurements using optical coherence tomography angiography in retinal diseases," Br. J. Ophthalmol., vol. 103, no. 5, pp. 704–710, 2019.
[31] S. Sankaranarayanan, Y. Balaji, A. Jain, S. N. Lim, and R. Chellappa, "Unsupervised domain adaptation for semantic segmentation with gans," arXiv preprint arXiv:1711.06969, vol. 2, no. 2, p. 2, 2017.
[32] Y. Zhang, P. David, and B. Gong, "Curriculum domain adaptation for semantic segmentation of urban scenes," in Proc. IEEE Int. Conf. Comput. Vision, 2017, pp. 2020–2030.
[33] Y.-H. Tsai, W.-C. Hung, S. Schulter, K. Sohn, M.-H. Yang, and M. Chandraker, "Learning to adapt structured output space for semantic segmentation," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2018, pp. 7472–7481.
[34] K. Bousmalis, N. Silberman, D. Dohan, D. Erhan, and D. Krishnan, "Unsupervised pixel-level domain adaptation with generative adversarial networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2017, pp. 3722–3731.
[35] C. Li and M. Wand, "Precomputed real-time texture synthesis with markovian generative adversarial networks," in Proc. Eur. Conf. Comput. Vis. Springer, 2016, pp. 702–716.
[36] C.-L. Tsai, C.-Y. Li, G. Yang, and K.-S. Lin, "The edge-driven dual-bootstrap iterative closest point algorithm for registration of multimodal fluorescein angiogram sequence," IEEE Trans. Med. Imag., vol. 29, no. 3, pp. 636–649, 2009.
[37] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu, "Residual dense network for image super-resolution," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2018, pp. 2472–2481.
[38] Y. Chen, X. Dai, M. Liu, D. Chen, L. Yuan, and Z. Liu, "Dynamic convolution: Attention over convolution kernels," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2020, pp. 11030–11039.
[39] J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2018, pp. 7132–7141.
[40] C. Huang and H.-H. Nien, "Multi chaotic systems based pixel shuffle for image encryption," Opt. Commun., vol. 282, no. 11, pp. 2123–2127, 2009.
[41] P. Bao, L. Zhang, and X. Wu, "Canny edge detection enhancement by scale multiplication," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 9, pp. 1485–1490, 2005.
[42] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
[43] R. Keys, "Cubic convolution interpolation for digital image processing," IEEE Trans. Acoust., Speech, Signal Process., vol. 29, no. 6, pp. 1153–1160, 1981.
[44] B. Lim, S. Son, H. Kim, S. Nah, and K. Mu Lee, "Enhanced deep residual networks for single image super-resolution," in Proc. IEEE Conf. Comput. Vis. Pattern Recog. Workshops, 2017, pp. 136–144.
[45] S. Anwar and N. Barnes, "Densely residual laplacian super-resolution," IEEE Trans. Pattern Anal. Mach. Intell., 2020.
[46] A. Shocher, N. Cohen, and M. Irani, "'Zero-shot' super-resolution using deep internal learning," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2018, pp. 3118–3126.
[47] X. Ji, Y. Cao, Y. Tai, C. Wang, J. Li, and F. Huang, "Real-world super-resolution via kernel estimation and noise injection," in Proc. IEEE Conf. Comput. Vis. Pattern Recog. Workshops, 2020, pp. 466–467.