Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance

From MaRDI portal
Publication: 2233558

DOI: 10.1214/21-EJS1880 · zbMATH Open: 1471.62442 · arXiv: 2012.04002 · OpenAlex: W3192961449 · MaRDI QID: Q2233558 · FDO: Q2233558


Authors: Anas Barakat, Pascal Bianchi, W. Hachem, Sholom Schechtman


Publication date: 11 October 2021

Published in: Electronic Journal of Statistics

Abstract: In this paper, a general stochastic optimization procedure is studied, unifying several variants of stochastic gradient descent, including the stochastic heavy ball method, the Stochastic Nesterov Accelerated Gradient algorithm (S-NAG), and the widely used Adam algorithm. The algorithm is seen as a noisy Euler discretization of a non-autonomous ordinary differential equation, recently introduced by Belotto da Silva and Gazeau, which is analyzed in depth. Assuming that the objective function is non-convex and differentiable, the stability and the almost sure convergence of the iterates to the set of critical points are established. A noteworthy special case is the convergence proof of S-NAG in a non-convex setting. Under some assumptions, the convergence rate is provided in the form of a Central Limit Theorem. Finally, the non-convergence of the algorithm to undesired critical points, such as local maxima or saddle points, is established. Here, the main ingredient is a new avoidance-of-traps result for non-autonomous settings, which is of independent interest.
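The family of momentum methods the abstract unifies can be illustrated with the stochastic heavy ball iteration, viewed as a noisy Euler-type discretization of a second-order ODE. The sketch below is a minimal, hypothetical illustration (step size, momentum coefficient, and noise model are placeholder assumptions, not the paper's exact scheme or constants):

```python
import numpy as np

def stochastic_heavy_ball(grad, x0, steps=1000, lr=0.01, momentum=0.9,
                          noise=0.1, seed=0):
    """Noisy discretization of the heavy-ball dynamics:
        v_{n+1} = momentum * v_n - lr * (grad(x_n) + noise_n)
        x_{n+1} = x_n + v_{n+1}
    where noise_n models the stochastic-gradient perturbation.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x) + noise * rng.standard_normal(x.shape)  # noisy gradient oracle
        v = momentum * v - lr * g
        x = x + v
    return x

# Example: f(x) = ||x||^2 / 2 has gradient x and unique critical point 0;
# the iterates settle into a small noise ball around it.
x_final = stochastic_heavy_ball(lambda x: x, x0=np.array([5.0, -3.0]))
```

Setting `momentum=0` recovers plain stochastic gradient descent; Adam and S-NAG replace the constant coefficients with time-varying ones, which is what makes the limiting ODE non-autonomous.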


Full work available at URL: https://arxiv.org/abs/2012.04002










Cited In (12)






This page was built for publication: Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance
