Make \(\ell_1\) regularization effective in training sparse CNN (Q782914)

From MaRDI portal
scientific article

    Statements

    Title: Make \(\ell_1\) regularization effective in training sparse CNN (English)
    Publication date: 29 July 2020
    This paper considers the training of sparse deep neural networks. The aim of the paper is to study sparse training algorithms for a special class of deep neural networks, namely convolutional neural networks (CNNs). The paper focuses on network pruning, the most popular compression method owing to its good compatibility and competitive performance. The authors report that the simple dual averaging (SDA) method, with appropriate modifications, can also be made highly effective with an \(\ell_1\) regularization for obtaining sparse convolutional neural networks. In particular, by combining it with an \(\ell_1\) regularization, the authors develop the corresponding regularized dual averaging (RDA) method. The RDA method was originally developed by \textit{L. Xiao} [J. Mach. Learn. Res. 11, 2543--2596 (2010; Zbl 1242.62011)] specifically for convex problems. The RDA method proposed by the authors in this paper (using proper initialization and adaptivity) with an \(\ell_1\) regularization achieves state-of-the-art sparsity for highly non-convex CNNs compared to other weight pruning methods, without compromising generalization accuracy.
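    To make the update concrete: in the classical convex \(\ell_1\)-RDA method of Xiao cited above, the iterate has the closed form \(w_{t+1} = -(\sqrt{t}/\gamma)\,\operatorname{shrink}(\bar g_t, \lambda)\), where \(\bar g_t\) is the running average of the stochastic gradients, \(\operatorname{shrink}\) is componentwise soft-thresholding, and \(\gamma\) is a step-size constant. The following is a minimal NumPy sketch of that classical update, not the authors' modified CNN variant; the values of lam and gamma and the toy quadratic objective are illustrative assumptions.

import numpy as np

def l1_rda_step(g_bar, t, lam, gamma):
    # Closed-form l1-RDA update (Xiao, 2010): soft-threshold the
    # running-average gradient g_bar, then scale by sqrt(t)/gamma.
    shrunk = np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
    return -(np.sqrt(t) / gamma) * shrunk

# Toy illustration (assumed objective, not from the paper): noisy
# gradients of the quadratic 0.5*||w - w_star||^2 with a sparse w_star.
rng = np.random.default_rng(0)
w_star = np.zeros(50)
w_star[:5] = rng.normal(0.0, 5.0, size=5)

w, g_bar = np.zeros(50), np.zeros(50)
for t in range(1, 501):
    grad = (w - w_star) + rng.normal(0.0, 0.1, size=50)
    g_bar += (grad - g_bar) / t          # running average of gradients
    w = l1_rda_step(g_bar, t, lam=0.5, gamma=5.0)

print("nonzero weights:", np.count_nonzero(w), "of", w.size)

    The \(\ell_1\) term enters only through the soft-thresholding, which drives individual weights exactly to zero rather than merely making them small; per the review, the authors' contribution is the initialization and adaptivity needed to make this scheme effective in the non-convex CNN setting.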
    Keywords: sparse optimization; \(\ell_1\) regularization; dual averaging