SSN: learning sparse switchable normalization via SparsestMax
From MaRDI portal
Abstract: Normalization methods improve both optimization and generalization of ConvNets. To further boost performance, the recently proposed switchable normalization (SN) provides a new perspective for deep learning: it learns to select different normalizers for different convolution layers of a ConvNet. However, SN uses the softmax function to learn importance ratios to combine normalizers, leading to redundant computations compared to a single normalizer. This work addresses this issue by presenting Sparse Switchable Normalization (SSN), where the importance ratios are constrained to be sparse. Unlike \(\ell_0\) and \(\ell_1\) constraints that impose difficulties in optimization, we turn this constrained optimization problem into feed-forward computation by proposing SparsestMax, which is a sparse version of softmax. SSN has several appealing properties. (1) It inherits all benefits from SN such as applicability in various tasks and robustness to a wide range of batch sizes. (2) It is guaranteed to select only one normalizer for each normalization layer, avoiding redundant computations. (3) SSN can be transferred to various tasks in an end-to-end manner. Extensive experiments show that SSN outperforms its counterparts on various challenging benchmarks such as ImageNet, Cityscapes, ADE20K, and Kinetics.
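The abstract does not give SparsestMax's exact formula, but it describes it as a sparse, feed-forward alternative to softmax that can drive importance ratios to exact zeros. A natural starting point is sparsemax (Martins & Astudillo, 2016), the Euclidean projection onto the probability simplex, which SparsestMax builds on; the sketch below is an illustrative NumPy implementation of that base projection, not the authors' released code.

```python
import numpy as np

def sparsemax(z):
    """Euclidean projection of logits z onto the probability simplex.

    Unlike softmax, the result can contain exact zeros, so only a
    subset of normalizers would receive nonzero importance ratios.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]            # sort logits in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum    # indices kept in the support
    k_z = k[support][-1]                   # size of the support set
    tau = (cumsum[support][-1] - 1) / k_z  # threshold shifting the logits
    return np.maximum(z - tau, 0.0)        # clip everything below tau to 0
```

For well-separated logits the projection is already one-hot, e.g. `sparsemax([2.0, 1.0, 0.1])` returns `[1.0, 0.0, 0.0]`; SparsestMax is reported to additionally guarantee that exactly one normalizer is selected per layer at the end of training.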
Recommendations
- Make \(\ell_1\) regularization effective in training sparse CNN
- Sparse deep neural networks using \(L_{1,\infty}\)-weight normalization
- Inference, learning and attention mechanisms that exploit and preserve sparsity in CNNs
- Learning sparse deep neural networks with a spike-and-slab prior
- Consistent Sparse Deep Learning: Theory and Computation
Cited in (3)
This page was built for publication: SSN: learning sparse switchable normalization via SparsestMax