Behavior of accelerated gradient methods near critical points of nonconvex functions

Publication:2425178

DOI: 10.1007/S10107-018-1340-Y
zbMATH Open: 1415.90092
arXiv: 1706.07993
OpenAlex: W2964265968
Wikidata: Q129037120
Scholia: Q129037120
MaRDI QID: Q2425178
FDO: Q2425178

Yanyan Li

Publication date: 26 June 2019

Published in: Mathematical Programming. Series A. Series B

Abstract: We examine the behavior of accelerated gradient methods in smooth nonconvex unconstrained optimization, focusing in particular on their behavior near strict saddle points. Accelerated methods are iterative methods that typically step along a direction that is a linear combination of the previous step and the gradient of the function evaluated at a point at or near the current iterate. (The previous step encodes gradient information from earlier stages in the iterative process.) We show by means of the stable manifold theorem that the heavy-ball method is unlikely to converge to strict saddle points, which are points at which the gradient of the objective is zero but the Hessian has at least one negative eigenvalue. We then examine the behavior of the heavy-ball method and other accelerated gradient methods in the vicinity of a strict saddle point of a nonconvex quadratic function, showing that these methods can diverge from this point more rapidly than the steepest-descent method.
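
As a rough numerical illustration of the iteration described in the abstract (not taken from the paper itself), the following Python sketch compares the heavy-ball update x_{k+1} = x_k - alpha * grad f(x_k) + beta * (x_k - x_{k-1}) with plain steepest descent on a simple quadratic that has a strict saddle at the origin. The matrix H, step size alpha, momentum parameter beta, iteration count, and starting point are all illustrative assumptions, not values from the paper.

import numpy as np

# Illustrative nonconvex quadratic f(x) = 0.5 * x^T H x with a strict saddle at
# the origin: H has one positive and one negative eigenvalue.
H = np.diag([1.0, -0.5])

def grad(x):
    return H @ x

def steepest_descent(x0, alpha=0.1, iters=100):
    x = x0.copy()
    for _ in range(iters):
        x = x - alpha * grad(x)
    return x

def heavy_ball(x0, alpha=0.1, beta=0.8, iters=100):
    # Heavy-ball step: a linear combination of the current gradient and the
    # previous step (the momentum term beta * (x - x_prev)).
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        x_next = x - alpha * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Start near the saddle, slightly off its stable manifold (here, the first axis).
x0 = np.array([1.0, 1e-6])
print("distance from saddle after 100 iterations:")
print("  steepest descent:", np.linalg.norm(steepest_descent(x0)))
print("  heavy ball      :", np.linalg.norm(heavy_ball(x0)))

In this toy setting both iterations drift away from the saddle along the negative-curvature direction, with the heavy-ball iterates leaving its neighborhood noticeably faster than steepest descent, consistent with the qualitative behavior described in the abstract.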


Full work available at URL: https://arxiv.org/abs/1706.07993





Cites Work


Cited In (11)






This page was built for publication: Behavior of accelerated gradient methods near critical points of nonconvex functions
