
A CONVNET FOR THE 2020S

Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell and S. Xie

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)


March 2022
SWIN TRANSFORMER
ImageNet top-1 accuracy: ResNet-50 76.1% vs. Swin-T 81.3%

OVERVIEW

TRAINING STRATEGY
ResNet-50: 76.1% → 78.8% (Swin-T: 81.3%)

Optimizer: SGD with momentum → AdamW
Epochs: 90 → 300
Data augmentation: Mixup, CutMix, RandAugment, Random Erasing
Regularization: Stochastic Depth, Label Smoothing

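As a flavor of these augmentations, Mixup can be sketched in a few lines of plain Python (a toy version on feature lists; real pipelines operate on image tensors and sample λ from a Beta(α, α) distribution — the helper name here is ours):

```python
import random

def mixup(x1, y1, x2, y2, lam=None, alpha=0.2):
    """Blend two samples and their one-hot labels with mixing weight lam.

    If lam is not given, it is drawn from Beta(alpha, alpha), as in the
    original Mixup recipe.
    """
    if lam is None:
        lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

# With lam = 0.5 the mixed sample is the midpoint of the two inputs.
x, y, lam = mixup([1.0, 0.0], [1, 0], [0.0, 1.0], [0, 1], lam=0.5)
```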

MACRO DESIGN
ResNet-50: 78.8% → 79.4% (Swin-T: 81.3%)

STAGE COMPUTE RATIO
Blocks per stage: ResNet-50's (3, 4, 6, 3) → (3, 3, 9, 3), matching Swin-T's 1:1:3:1 stage compute ratio
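The stage-ratio change can be stated as a one-liner (block counts are from the paper; the helper name is ours):

```python
# Per-stage block counts: ResNet-50 vs. ConvNeXt-T.
resnet50_blocks = (3, 4, 6, 3)
convnext_t_blocks = (3, 3, 9, 3)

def stage_ratio(blocks):
    """Block counts relative to the first stage."""
    base = blocks[0]
    return tuple(b / base for b in blocks)

# ConvNeXt-T follows Swin-T's 1:1:3:1 compute ratio across stages.
print(stage_ratio(convnext_t_blocks))  # (1.0, 1.0, 3.0, 1.0)
```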

MACRO DESIGN
ResNet-50: 79.4% → 79.5% (Swin-T: 81.3%)

STEM TO PATCHIFY
ResNet's stem (7×7 conv, stride 2, plus max pooling) is replaced by a "patchify" stem: a 4×4 convolution with stride 4, as in Swin
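A toy patchify on a plain 2D list illustrates the non-overlapping downsampling (no learned weights here; the real stem is a strided 4×4 convolution):

```python
def patchify(img, p=4):
    """Split a 2D image (list of rows) into non-overlapping p×p patches,
    mimicking a kernel-p, stride-p conv stem's spatial behavior."""
    h, w = len(img), len(img[0])
    patches = []
    for i in range(0, h, p):
        for j in range(0, w, p):
            patches.append([row[j:j + p] for row in img[i:i + p]])
    return patches

# An 8×8 input yields (8/4) × (8/4) = 4 patches.
img = [[r * 8 + c for c in range(8)] for r in range(8)]
print(len(patchify(img)))  # 4
```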

RESNEXT-IFY
ResNet-50: 79.5% → 80.5% (Swin-T: 81.3%)

Grouped convolution: depthwise + pointwise convolutions
Increase the network width: 64 → 96 channels
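The parameter savings from factoring a standard convolution into depthwise + pointwise are easy to count (standard formulas; the helper names are ours):

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k×k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise: one k×k filter per input channel.
    Pointwise: a 1×1 convolution mixing channels."""
    return k * k * c_in + c_in * c_out

# At 96 channels with 3×3 kernels, the factorized form is ~8× cheaper.
std = standard_conv_params(3, 96, 96)        # 82944
sep = depthwise_separable_params(3, 96, 96)  # 10080
```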

MACRO DESIGN
ResNet-50: 80.5% → 80.6% (Swin-T: 81.3%)

Modifying the bottleneck:
ResNeXt block → inverted bottleneck (hidden dimension wider than the input, as in Transformer MLP blocks)
Depthwise conv layer moved up, before the 1×1 expansion
Kernel size increased: 3×3 → 7×7
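Counting the weights of one such block (7×7 depthwise, then two 1×1 convs with 4× expansion; the helper name is ours, biases and norm/activation layers ignored):

```python
def convnext_block_params(dim, k=7, expansion=4):
    """Parameter count of a ConvNeXt-style inverted-bottleneck block."""
    dw = k * k * dim                  # 7×7 depthwise conv at width dim
    expand = dim * expansion * dim    # 1×1 conv: dim → 4·dim
    project = expansion * dim * dim   # 1×1 conv: 4·dim → dim
    return dw + expand + project

# At dim = 96: 49·96 + 96·384 + 384·96 = 78432 weights.
print(convnext_block_params(96))  # 78432
```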

MICRO DESIGN
ResNet-50: 80.6% → 81.4% (Swin-T: 81.3%)

Replacing ReLU with GELU


Reducing number of activation functions
Reducing number of normalization layers
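GELU itself is one line of stdlib Python (the exact form, x·Φ(x) with Φ the standard normal CDF):

```python
import math

def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Like ReLU at the origin, but smooth and non-zero for small negatives.
print(gelu(0.0))  # 0.0
```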

MICRO DESIGN
ResNet-50: 81.4% → 82.0% (Swin-T: 81.3%)

Batch Normalization → Layer Normalization


Separating downsampling layers
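A minimal LayerNorm over a single feature vector, for intuition — unlike BatchNorm it uses per-sample statistics, so it is independent of batch size (the learned scale/shift of the real layer is omitted here):

```python
import math

def layer_norm(x, eps=1e-6):
    """Normalize one feature vector to zero mean and unit variance
    over its channel dimension (per-sample, not per-batch)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

print(layer_norm([1.0, 2.0, 3.0]))  # ≈ [-1.2247, 0.0, 1.2247]
```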

RESULTS
CLASSIFICATION

RESULTS
DETECTION

THANK YOU
