
AdderNet: Do We Really Need Multiplications in Deep Learning? (CVPR 2020)

Hanting Chen1,2, Yunhe Wang2, Chunjing Xu2, Boxin Shi1, Chao Xu1, Qi Tian2, Chang Xu3
1Peking University; 2Huawei Technologies; 3The University of Sydney

In this paper, we propose adder networks (AdderNets), which trade the massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs.

Background
• The high power consumption of deep networks has blocked modern deep learning systems from being deployed on mobile devices, e.g., smartphones, cameras, and watches.

• Existing methods focus on pruning, quantization, etc., but the compressed networks still involve massive multiplications.

(Figure: compressed networks still contain massive multiplications.)



Background
• Compared with the cheap addition operation, multiplication has a much higher energy cost.

• Can we replace the multiplications in CNNs with additions?



Method

CNN (cross-correlation):

Y(m,n,t) = \sum_{i=0}^{d} \sum_{j=0}^{d} \sum_{k=0}^{c_{in}} X(m+i, n+j, k) \times F(i, j, k, t)

AdderNet (L1 distance):

Y(m,n,t) = -\sum_{i=0}^{d} \sum_{j=0}^{d} \sum_{k=0}^{c_{in}} \left| X(m+i, n+j, k) - F(i, j, k, t) \right|

where X is the input feature and F denotes the filters.

• Convolutions are essentially cross-correlations, which measure the similarity between the input features and the convolution filters.

• Instead, AdderNets take the L1-norm distance between the filters and the input features as the output response, which can also measure their similarity (a minimal sketch is given below).
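A minimal sketch of this idea, assuming a PyTorch-style helper (adder2d_forward is a hypothetical function for illustration, not the official huawei-noah/AdderNet code): the output is the negative L1 distance between every filter and every input patch, so it needs only subtractions and additions.

```python
import torch
import torch.nn.functional as F

def adder2d_forward(x, weight, stride=1, padding=0):
    """x: (N, C_in, H, W); weight: (C_out, C_in, d, d)."""
    c_out, c_in, d, _ = weight.shape
    # Extract sliding patches: (N, C_in*d*d, L) with L = H_out * W_out.
    patches = F.unfold(x, kernel_size=d, stride=stride, padding=padding)
    n = x.shape[0]
    w = weight.view(c_out, -1)  # (C_out, C_in*d*d)
    # Negative L1 distance between each filter and each patch:
    # no multiplications, only subtractions, absolute values, and sums.
    out = -(patches.unsqueeze(1) - w.unsqueeze(0).unsqueeze(-1)).abs().sum(dim=2)
    h_out = (x.shape[2] + 2 * padding - d) // stride + 1
    w_out = (x.shape[3] + 2 * padding - d) // stride + 1
    return out.view(n, c_out, h_out, w_out)

# Example usage (shapes match a standard 3x3 convolution with padding 1):
# x = torch.randn(2, 3, 8, 8); w = torch.randn(16, 3, 3, 3)
# y = adder2d_forward(x, w, stride=1, padding=1)  # -> (2, 16, 8, 8)
```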



Method

• The gradients in AdderNets are sign gradients (±1), which are unsuitable for optimizing neural networks with a huge number of parameters.

• To achieve better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient.

• Besides, the norms of the filter gradients in AdderNets are much smaller than those in CNNs, which could slow down the update of the filters in AdderNets.

• We propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient (see the sketch below).
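A minimal sketch of these two training tricks, again in PyTorch and not the official implementation: the backward pass uses the full-precision value (x - f) in place of sign(x - f) for the filter gradient, clips the input gradient to [-1, 1] (HardTanh-style, as assumed here), and the layer-wise learning rate is scaled as eta * sqrt(k) / ||grad||_2, where k is the number of filter parameters and eta a global hyper-parameter.

```python
import torch

class AdderOp(torch.autograd.Function):
    """Toy adder output y = -sum(|x - f|) with the surrogate gradients described above."""

    @staticmethod
    def forward(ctx, x, f):
        ctx.save_for_backward(x, f)
        return -(x - f).abs().sum()

    @staticmethod
    def backward(ctx, grad_out):
        x, f = ctx.saved_tensors
        # Filter gradient: the exact gradient is sign(x - f);
        # use the full-precision value (x - f) instead.
        grad_f = grad_out * (x - f)
        # Input gradient: clip (f - x) to [-1, 1] so the back-propagated
        # error does not blow up across layers.
        grad_x = grad_out * torch.clamp(f - x, -1.0, 1.0)
        return grad_x, grad_f

def adaptive_lr(filter_grad, eta=0.1, eps=1e-8):
    """Layer-wise learning rate: eta * sqrt(k) / ||grad||_2."""
    k = filter_grad.numel()
    return eta * (k ** 0.5) / (filter_grad.norm(p=2) + eps)

# Example usage:
# x = torch.randn(5, requires_grad=True); f = torch.randn(5, requires_grad=True)
# AdderOp.apply(x, f).backward()
# lr = adaptive_lr(f.grad)
```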



Results

• As a result, the proposed AdderNets achieve performance comparable to CNNs on the CIFAR and ImageNet datasets while using very few multiplications.



Results

• We visualize the filters, features, and weight distributions of AdderNets (left) and CNNs (right).



Thank You!
Questions?

Mail: chenhanting@pku.edu.cn

Github: https://github.com/huawei-noah/AdderNet

