Sparse optimization on measures with over-parameterized gradient descent
Abstract: Minimizing a convex function of a measure with a sparsity-inducing penalty is a typical problem arising, e.g., in sparse spikes deconvolution or two-layer neural network training. We show that this problem can be solved by discretizing the measure and running non-convex gradient descent on the positions and weights of the particles. For measures on a d-dimensional manifold and under some non-degeneracy assumptions, this leads to a global optimization algorithm with a complexity scaling as log(1/ε) in the desired accuracy ε, instead of ε^{-d} for convex methods. The key theoretical tools are a local convergence analysis in Wasserstein space and an analysis of a perturbed mirror descent in the space of measures. Our bounds involve quantities that are exponential in d, which is unavoidable under our assumptions.
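To illustrate the approach described in the abstract, below is a minimal sketch (not the authors' code) of over-parameterized particle gradient descent for 1D sparse spikes deconvolution: the measure is discretized into many weighted particles and plain gradient descent is run jointly on their positions and weights. The Gaussian measurement kernel, step size, penalty weight, particle count, and the sign subgradient for the sparsity penalty are illustrative assumptions; the paper's actual algorithm and its mirror-descent analysis use a specific particle parameterization not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth sparse measure: three spikes on [0, 1] (illustrative).
true_pos = np.array([0.25, 0.5, 0.8])
true_w = np.array([1.0, -0.7, 0.5])

# Observations y_k = sum_i w_i * phi(x_i, t_k) with a Gaussian kernel phi (assumed).
t = np.linspace(0.0, 1.0, 50)
sigma = 0.05  # kernel width (assumed)

def features(x):
    """Feature matrix Phi[i, k] = exp(-(t_k - x_i)^2 / (2 sigma^2))."""
    return np.exp(-((t[None, :] - x[:, None]) ** 2) / (2.0 * sigma ** 2))

y = true_w @ features(true_pos)

# Over-parameterization: many more particles than true spikes, spread over the domain.
m = 100
pos = np.linspace(0.0, 1.0, m)
w = 1e-3 * rng.standard_normal(m)

lam, lr, n_steps = 0.01, 0.01, 20000  # penalty weight and step size (assumed)

for _ in range(n_steps):
    Phi = features(pos)
    resid = w @ Phi - y
    # Objective: 0.5 * ||w @ Phi - y||^2 + lam * sum(|w_i|)
    grad_w = Phi @ resid + lam * np.sign(w)
    dPhi = Phi * (t[None, :] - pos[:, None]) / sigma ** 2  # d(phi)/d(x_i)
    grad_pos = w * (dPhi @ resid)
    w -= lr * grad_w
    pos -= lr * grad_pos

# Particles with non-negligible weight should cluster near the true spike
# positions when recovery succeeds.
keep = np.abs(w) > 1e-2
print(np.round(pos[keep], 3))
print(np.round(w[keep], 3))
```

The over-parameterization (starting from many particles spread over the domain rather than one per expected spike) is what the abstract refers to: with enough initial particles, the non-convex descent on positions and weights can reach the global optimum of the underlying convex problem on measures.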
Recommendations
- Global convergence analysis of sparse regular nonconvex optimization problems
- Analysis of a two-layer neural network via displacement convexity
- Gradient descent with non-convex constraints: local concavity determines convergence
- Linear convergence of accelerated conditional gradient algorithms in spaces of measures
- On the minimization of a Tikhonov functional with a non-convex sparsity constraint
Cites work
- scientific article; zbMATH DE number 3680123 (no title available)
- scientific article; zbMATH DE number 3790208 (no title available)
- scientific article; zbMATH DE number 51537 (no title available)
- scientific article; zbMATH DE number 1061412 (no title available)
- scientific article; zbMATH DE number 1972910 (no title available)
- scientific article; zbMATH DE number 2107836 (no title available)
- scientific article; zbMATH DE number 5223994 (no title available)
- scientific article; zbMATH DE number 3313108 (no title available)
- A JKO splitting scheme for Kantorovich-Fisher-Rao gradient flows
- A course in metric geometry
- A family of functional inequalities: Łojasiewicz inequalities and displacement convex functions
- A mean field view of the landscape of two-layer neural networks
- A new optimal transport distance on the space of finite Radon measures
- Accelerated information gradient flow
- An interpolating distance between optimal transport and Fisher-Rao metrics
- An iterative thresholding algorithm for linear inverse problems with a sparsity constraint
- Breaking the curse of dimensionality with convex neural networks
- Compressed Sensing Off the Grid
- Deep learning
- Exact Solutions to Super Resolution on Semi-Algebraic Domains in Higher Dimensions
- Exact reconstruction using Beurling minimal extrapolation
- Exact solutions of infinite dimensional total-variation regularized problems
- Exact support recovery for sparse spikes deconvolution
- Gradient flows in metric spaces and in the space of probability measures
- Inverse problems in spaces of measures
- Kurdyka–Łojasiewicz–Simon inequality for gradient flows in metric spaces
- Mean field analysis of neural networks: a law of large numbers
- Mirror descent and nonlinear projected subgradient methods for convex optimization
- Natural gradient via optimal transport
- On representer theorems and convex regularization
- On the Rate of Convergence of Empirical Measures in ∞-transportation Distance
- On the linear convergence rates of exchange and continuous methods for total variation minimization
- Optimal entropy-transport problems and a new Hellinger-Kantorovich distance between positive measures
- Optimal transport for applied mathematicians. Calculus of variations, PDEs, and modeling
- Optimization with sparsity-inducing penalties
- Poincaré and logarithmic Sobolev inequalities by decomposition of the energy landscape
- Positive trigonometric polynomials and signal processing applications
- Probabilistic representation and uniqueness results for measure-valued solutions of transport equations
- Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance
- Sparse modeling for image and vision processing
- The alternating descent conditional gradient method for sparse inverse problems
- The basins of attraction of the global minimizers of the non-convex sparse spike estimation problem
- The sliding Frank-Wolfe algorithm and its application to super-resolution microscopy
- Towards a Mathematical Theory of Super‐resolution
- Unbalanced optimal transport: dynamic and Kantorovich formulations
Cited in (12)
- Global convergence analysis of sparse regular nonconvex optimization problems
- Estimation of off-the grid sparse spikes with over-parametrized projected gradient descent: theory and application
- Proximal methods for point source localisation
- A rigorous framework for the mean field limit of multilayer neural networks
- On the uniqueness of solutions for the basis pursuit in the continuum
- More is Less: Inducing Sparsity via Overparameterization
- Learning sparse features can lead to overfitting in neural networks
- Localization of point scatterers via sparse optimization on measures
- Simultaneous off-the-grid learning of mixtures issued from a continuous dictionary
- Regularizing Orientation Estimation in Cryogenic Electron Microscopy Three-Dimensional Map Refinement through Measure-Based Lifting over Riemannian Manifolds
- Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
- Convergence rates of gradient methods for convex optimization in the space of measures