Sparse optimization on measures with over-parameterized gradient descent

From MaRDI portal
Publication:2149558

DOI: 10.1007/s10107-021-01636-z
zbMATH Open: 1494.90082
arXiv: 1907.10300
OpenAlex: W3158438262
MaRDI QID: Q2149558
FDO: Q2149558


Author: Lénaïc Chizat


Publication date: 29 June 2022

Published in: Mathematical Programming. Series A. Series B

Abstract: Minimizing a convex function of a measure with a sparsity-inducing penalty is a typical problem arising, e.g., in sparse spikes deconvolution or two-layer neural network training. We show that this problem can be solved by discretizing the measure and running non-convex gradient descent on the positions and weights of the particles. For measures on a d-dimensional manifold and under some non-degeneracy assumptions, this leads to a global optimization algorithm with a complexity scaling as log(1/epsilon) in the desired accuracy epsilon, instead of epsilon^(-d) for convex methods. The key theoretical tools are a local convergence analysis in Wasserstein space and an analysis of a perturbed mirror descent in the space of measures. Our bounds involve quantities that are exponential in d, which is unavoidable under our assumptions.
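
The following sketch illustrates the approach described in the abstract: the measure is discretized into particles and plain (sub)gradient descent is run jointly on their weights and positions, here for a 1-D sparse spikes deconvolution problem with a Gaussian kernel. All names and numerical values below (phi, sigma, lam, m, the step size, the ground-truth spikes) are illustrative assumptions, not taken from the paper; in particular, the paper's analysis relies on a specific particle parameterization and mirror-descent-type arguments rather than the plain subgradient steps used here.

import numpy as np

# Minimal illustrative sketch (NOT the paper's exact algorithm): over-parameterized
# particle gradient descent for 1-D sparse spikes deconvolution. We minimize, over
# signed discrete measures mu = sum_i w_i * delta_{x_i},
#   F(w, x) = 0.5/n * || K(x) w - y ||^2 + lam * sum_i |w_i|,
# where K(x)[k, i] = phi(t_k - x_i) for a Gaussian kernel phi, by (sub)gradient
# descent jointly on the weights w and positions x. All constants are illustrative.

rng = np.random.default_rng(0)
sigma, lam = 0.1, 0.001
t = np.linspace(0.0, 1.0, 100)                  # observation grid, n = 100 samples
n = len(t)

def phi(u):                                     # Gaussian point-spread function
    return np.exp(-u**2 / (2.0 * sigma**2))

# Ground-truth sparse measure: three spikes (purely illustrative)
x_true = np.array([0.25, 0.50, 0.80])
w_true = np.array([1.0, -0.7, 0.5])
y = phi(t[:, None] - x_true[None, :]) @ w_true

# Over-parameterization: many more particles than true spikes
m = 50
x = rng.uniform(0.0, 1.0, m)                    # initial particle positions
w = rng.normal(0.0, 1e-2, m)                    # small initial weights

step = 0.05                                     # untuned constant step size
for _ in range(20000):
    K = phi(t[:, None] - x[None, :])            # n x m kernel matrix
    resid = K @ w - y
    grad_w = K.T @ resid / n + lam * np.sign(w)          # subgradient in the weights
    dK = K * (t[:, None] - x[None, :]) / sigma**2        # d/dx_i of the column K[:, i]
    grad_x = w * (dK.T @ resid) / n                      # gradient in the positions
    w -= step * grad_w
    x -= step * grad_x

print("particles with non-negligible weight:")
for wi, xi in sorted(zip(w, x), key=lambda p: p[1]):
    if abs(wi) > 0.05:
        print(f"  position {xi:.3f}   weight {wi:+.3f}")

Over-parameterization (m much larger than the number of true spikes) is what allows some particles to move toward each true spike while the superfluous ones keep negligible weight; step sizes and iteration counts above are not tuned.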


Full work available at URL: https://arxiv.org/abs/1907.10300




Cited in: 9 documents




