Make \(\ell_1\) regularization effective in training sparse CNN (Q782914)
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Make \(\ell_1\) regularization effective in training sparse CNN | scientific article | |
Statements
Make \(\ell_1\) regularization effective in training sparse CNN (English)
29 July 2020
This paper considers the training of sparse deep neural networks. Its aim is to study sparse training algorithms for a special class of deep neural networks, namely convolutional neural networks (CNNs). The paper focuses on network pruning, the most popular compression method owing to its good compatibility and competitive performance. The authors report that the simple dual averaging (SDA) method, with appropriate modifications, can be made highly effective with an \(\ell_1\) regularization for obtaining sparse convolutional neural networks. In particular, by combining it with an \(\ell_1\) regularization, the authors develop the corresponding regularized dual averaging (RDA) method. \textit{L. Xiao} [J. Mach. Learn. Res. 11, 2543--2596 (2010; Zbl 1242.62011)] originally developed the RDA method specifically for convex problems. The RDA method proposed by the authors in this paper (using proper initialization and adaptivity) with an \(\ell_1\) regularization achieves state-of-the-art sparsity for highly non-convex CNNs compared to other weight pruning methods, without compromising generalization accuracy.
sparse optimization
\(\ell_1\) regularization
dual averaging