
LOW-POWER TO HIGH-POWER TRANSLATION
Nick Tai, 2021/1/11

MODEL
A Noise-Robust Framework for Automatic Segmentation of COVID-19 Pneumonia Lesions From CT Images (IEEE Transactions on Medical Imaging)
COPLE-Net: a general model architecture for mapping a source image to a target image (image translation)

The characteristics of COPLE-Net are: (1) channel attention is adopted; (2) ASPP captures multi-scale semantics; (3) the shortcuts are themselves encoder-decoder submodules
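As a quick reference for (2), here is a minimal ASPP sketch in PyTorch; the dilation rates and channel counts are illustrative assumptions, not the exact COPLE-Net configuration.

```python
# Minimal ASPP sketch: parallel dilated convolutions capture
# multi-scale semantics, then a 1x1 conv fuses the branches.
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, channels, rates=(1, 2, 4, 6)):  # rates are assumptions
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 32, 64, 64)
print(ASPP(32)(x).shape)  # torch.Size([1, 32, 64, 64])
```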

EXPERIMENT SETUP
Dataset 20201215

Channel A is adopted for the POC

Training: Areas 1~4 (some areas contain 60 slices, some contain 70)

Validation: Area 5 (70 slices)

This is assumed to be a more difficult setting, since the context of Area 5 is never included in the training set

2D slices are used as samples

No depth information involved

Involving depth information is a potential improvement (I'll propose one later for reference)

The optimizer is AdaBelief with Lookahead and Gradient Centralization; gradient-norm clipping is set to 1 (see the sketch after this list)

Epochs are set to 300; no augmentation; 4 GPUs, each with batch size 8

LR is set to 1e-3 for the first 72% of epochs; the final 28% switch to cosine decay down to 0
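A minimal sketch of this optimization setup, assuming the adabelief-pytorch and torch-optimizer pip packages; the tiny model and random data are stand-ins for COPLE-Net and the CT slice loader, and anything not stated on this slide (Lookahead k/alpha, the manual GC step) is an assumption.

```python
import math
import torch
from adabelief_pytorch import AdaBelief
from torch_optimizer import Lookahead

model = torch.nn.Conv2d(1, 1, 3, padding=1)  # stand-in for COPLE-Net
loader = [(torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64))]  # stand-in data

EPOCHS, CONST_FRAC = 300, 0.72  # constant LR for the first 72% of epochs

base = AdaBelief(model.parameters(), lr=1e-3)
optimizer = Lookahead(base)  # default k/alpha; Lookahead shares param_groups

def lr_lambda(epoch):
    # 1.0 (i.e., lr = 1e-3) until 72% of epochs, then cosine decay to 0.
    if epoch < CONST_FRAC * EPOCHS:
        return 1.0
    progress = (epoch - CONST_FRAC * EPOCHS) / ((1 - CONST_FRAC) * EPOCHS)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for epoch in range(EPOCHS):
    for x, y in loader:
        loss = torch.nn.functional.l1_loss(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        # Gradient Centralization: subtract the per-filter gradient mean
        # (done manually here; some optimizer variants build this in).
        for p in model.parameters():
            if p.grad is not None and p.grad.dim() > 1:
                p.grad -= p.grad.mean(dim=tuple(range(1, p.grad.dim())), keepdim=True)
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
    scheduler.step()
```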
TRAINING CURVE
Details will be in the attachment

The loss shown here should be L1, not L2; however, in the attached "metrics.csv", L2 is what is actually calculated

VISUALIZE
Details will be in the attachment

[Figure: image grids comparing the 50W input, the translated output, and the 94W target at slices Z=20, Z=30, Z=40, and Z=50]

POTENTIAL IMPROVEMENT (1)
More meaningful loss function

Although L1/L2 directly optimize pixel intensities, humans do not perceive an image in a pixel-independent manner

LPIPS is a recent technique that provides a more perceptually meaningful metric/loss function

It uses a pretrained lightweight network (AlexNet/VGG) for the distance measurement

Frequency-aware loss function: compare L1/L2 in the frequency domain (a sketch of both ideas follows)
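A minimal sketch of both ideas, assuming the `lpips` pip package (inputs are (N, 3, H, W) in [-1, 1] per its convention) and a magnitude-spectrum variant of the frequency-domain L1; the random tensors are stand-ins for translated/target slices.

```python
# Assumes the `lpips` pip package; grayscale CT slices would need
# to be repeated to 3 channels before the LPIPS call.
import torch
import lpips

lpips_fn = lpips.LPIPS(net='alex')  # or net='vgg'

def frequency_l1(pred, target):
    # One common variant: L1 between magnitude spectra of the 2D FFT.
    return (torch.fft.fft2(pred).abs() - torch.fft.fft2(target).abs()).abs().mean()

pred = torch.rand(1, 3, 64, 64) * 2 - 1    # stand-in translated image, in [-1, 1]
target = torch.rand(1, 3, 64, 64) * 2 - 1  # stand-in 94W target

loss = lpips_fn(pred, target).mean() + frequency_l1(pred, target)
print(loss.item())
```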

POTENTIAL IMPROVEMENT (2)
Depth-aware model

Assume the transformation mapping x to y is depth-independent (as in the current model)

We model the relationship y = f(x), where f is parameterized by COPLE-Net

However, it is intuitive to extend the model to consider depth information

To this end, the relationship is y = f(x, z_index)

Intuitively, we could train dedicated weights for each depth z, but hot-swapping the model weights is impractical and slow

We can make BatchNorm conditional on z_index by parameterizing the gamma and beta in BatchNorm (see the sketch below)

Related works: (1) [1703.06868] Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization (arxiv.org); (2) [1707.00683] Modulating early visual processing by language (arxiv.org)
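A minimal sketch of depth-conditional BatchNorm in the spirit of the conditional-normalization papers above; the class name, the 70-depth vocabulary, and one embedding row per integer z_index are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ConditionalBatchNorm2d(nn.Module):
    def __init__(self, num_features, num_depths):
        super().__init__()
        # Plain BN tracks running stats; affine is off because
        # gamma/beta come from the depth embedding instead.
        self.bn = nn.BatchNorm2d(num_features, affine=False)
        self.embed = nn.Embedding(num_depths, num_features * 2)
        # Initialize so gamma ~ 1 and beta ~ 0 at the start.
        self.embed.weight.data[:, :num_features].fill_(1.0)
        self.embed.weight.data[:, num_features:].zero_()

    def forward(self, x, z_index):
        out = self.bn(x)
        gamma, beta = self.embed(z_index).chunk(2, dim=1)
        return gamma[:, :, None, None] * out + beta[:, :, None, None]

# Usage: normalization modulated by each sample's slice depth.
bn = ConditionalBatchNorm2d(num_features=16, num_depths=70)
x = torch.randn(4, 16, 32, 32)
z = torch.randint(0, 70, (4,))
print(bn(x, z).shape)  # torch.Size([4, 16, 32, 32])
```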
