Optimal convergence rates for Nesterov acceleration
From MaRDI portal
Publication:5206941
Abstract: In this paper, we study the behavior of solutions of the ODE associated with Nesterov acceleration. It has been known since the pioneering work of Nesterov that the resulting convergence rate is optimal for the class of convex functions with Lipschitz-continuous gradient. In this work, we show that better convergence rates can be obtained under additional geometrical conditions, such as the Łojasiewicz property. More precisely, we prove the optimal convergence rates that can be obtained depending on the geometry of the function to minimize. These convergence rates are new, and they shed new light on the behavior of Nesterov acceleration schemes. In particular, we prove that the classical Nesterov scheme may yield convergence rates worse than those of classical gradient descent on sharp functions: for instance, the convergence rate for strongly convex functions is not geometric for the classical Nesterov scheme (whereas it is for the gradient descent algorithm). This shows that applying classical Nesterov acceleration to convex functions without examining the geometrical properties of the objective function may lead to sub-optimal algorithms.
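The abstract's central claim — that classical Nesterov momentum, which ignores strong convexity, can lose to plain gradient descent in the long run — can be checked numerically. The sketch below (illustrative only, not code from the paper) compares the two methods on an ill-conditioned strongly convex quadratic; the problem, step count, and all parameter values are chosen here for illustration.

```python
# Illustrative sketch (not the paper's code): compare plain gradient descent
# with the classical Nesterov scheme on the strongly convex quadratic
# f(x, y) = (L*x^2 + mu*y^2) / 2 with condition number L/mu = 100.
BIG_L, MU = 100.0, 1.0   # Lipschitz constant L, strong-convexity modulus mu
h = 1.0 / BIG_L          # standard step size 1/L for both methods
N = 5000                 # iteration budget (chosen for illustration)

def grad(p):
    return (BIG_L * p[0], MU * p[1])

def f(p):
    return 0.5 * (BIG_L * p[0] ** 2 + MU * p[1] ** 2)

# Gradient descent: the error contracts by (1 - mu/L) per step, a
# geometric (linear) rate.
p = (1.0, 1.0)
for _ in range(N):
    g = grad(p)
    p = (p[0] - h * g[0], p[1] - h * g[1])
gd_val = f(p)

# Classical Nesterov scheme with momentum (k - 1)/(k + 2), which does not
# use the strong-convexity modulus. Its iterates oscillate, so we record
# the worst function value over the last 200 iterations rather than a
# single (possibly lucky) final iterate.
x_prev = (1.0, 1.0)
y = x_prev
nes_tail = 0.0
for k in range(1, N + 1):
    g = grad(y)
    x_new = (y[0] - h * g[0], y[1] - h * g[1])
    beta = (k - 1.0) / (k + 2.0)
    y = (x_new[0] + beta * (x_new[0] - x_prev[0]),
         x_new[1] + beta * (x_new[1] - x_prev[1]))
    x_prev = x_new
    if k > N - 200:
        nes_tail = max(nes_tail, f(x_new))

# By this point gradient descent sits many orders of magnitude below the
# oscillation envelope of the untuned Nesterov iterates.
print(gd_val, nes_tail)
```

Restart schemes or momentum tuned to the modulus \(\mu\) recover a geometric rate for Nesterov-type methods; the sketch only shows that the untuned classical scheme forfeits it.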
Recommendations
- Convergence rates of an inertial gradient descent algorithm under growth and flatness conditions
- The rate of convergence of Nesterov's accelerated forward-backward method is actually faster than \(1/k^2\)
- Rate of convergence of the Nesterov accelerated gradient method in the subcritical case \(\alpha \leq 3\)
- Convergence rates of the heavy ball method for quasi-strongly convex optimization
- Generalized Nesterov's accelerated proximal gradient algorithms with convergence rate of order \(o(1/k^2)\)
Cites work
- scientific article; zbMATH DE number 3850830
- scientific article; zbMATH DE number 1993614
- scientific article; zbMATH DE number 3371284
- A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
- A differential equation for modeling Nesterov's accelerated gradient method: theory and insights
- Asymptotic for a second-order evolution equation with convex potential and vanishing damping term
- Asymptotic for the perturbed heavy ball system with vanishing damping term
- Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity
- Asymptotics for a second-order differential equation with nonautonomous damping and an integrable source term
- Asymptotics for some vibro-impact problems with a linear dissipation term
- Convergence of the forward-backward algorithm: beyond the worst-case with the help of geometry
- Convergence rates of an inertial gradient descent algorithm under growth and flatness conditions
- Error bounds and Hölder metric subregularity
- Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity
- From error bounds to the complexity of first-order descent methods for convex functions
- On damped second-order gradient systems
- On semi- and subanalytic geometry
- On the convergence of the iterates of the "fast iterative shrinkage/thresholding algorithm"
- On the convergence of the proximal algorithm for nonsmooth functions involving analytic features
- On the long time behavior of second order differential equations with asymptotically small dissipation
- Rate of convergence of the Nesterov accelerated gradient method in the subcritical case \(\alpha \leq 3\)
- Stability of over-relaxations for the forward-backward algorithm, application to FISTA
- The heavy ball with friction method, I. The continuous dynamical system: global exploration of the local minima of a real-valued function by asymptotic analysis of a dissipative dynamical system
- The differential inclusion modeling FISTA algorithm and optimality of convergence rate in the case \(b \leq 3\)
Cited in (36)
- Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance
- Convergence rates of damped inertial dynamics from multi-degree-of-freedom system
- Proximal Gradient Methods for Machine Learning and Imaging
- Convergence results of two-step inertial proximal point algorithm
- Convergence rates of first- and higher-order dynamics for solving linear ill-posed problems
- Convergence rates of damped inertial dynamics under geometric conditions and perturbations
- An extension of the second order dynamical system that models Nesterov's convex gradient method
- A differential equation for modeling Nesterov's accelerated gradient method: theory and insights
- From the ravine method to the Nesterov method and vice versa: a dynamical system perspective
- Weak and linear convergence of a generalized proximal point algorithm with alternating inertial steps for a monotone inclusion problem
- Stochastic differential equations for modeling first order optimization methods
- Inertial Newton algorithms avoiding strict saddle points
- Rate of convergence of the Nesterov accelerated gradient method in the subcritical case \(\alpha \leq 3\)
- The rate of convergence of Nesterov's accelerated forward-backward method is actually faster than \(1/k^2\)
- Convergence rates of an inertial gradient descent algorithm under growth and flatness conditions
- Convergence rates of the heavy ball method for quasi-strongly convex optimization
- Sharpness, restart, and acceleration
- Convergence rate of a relaxed inertial proximal algorithm for convex minimization
- Generalized proximal point algorithms with correction terms and extrapolation
- Convergence rates of inertial primal-dual dynamical methods for separable convex optimization problems
- Convergence rates for an inertial algorithm of gradient type associated to a smooth non-convex minimization
- A gradient-type algorithm with backward inertial steps associated to a nonconvex minimization problem
- The Nesterov accelerated gradient algorithm for auto-regressive exogenous models with random lost measurements: interpolation method and auxiliary model method
- Inertial projection and contraction algorithms with larger step sizes for solving quasimonotone variational inequalities
- Inertial primal-dual dynamics with damping and scaling for linearly constrained convex optimization problems
- Projection methods with alternating inertial steps for variational inequalities: weak and linear convergence
- Double inertial parameters forward-backward splitting method: Applications to compressed sensing, image processing, and SCAD penalty problems
- Fast convergence of inertial dynamics with Hessian-driven damping under geometry assumptions
- Optimal convergence rate of inertial gradient system with flat geometries and perturbations
- Convergence rates of the heavy-ball method under the Łojasiewicz property
- On the effect of perturbations in first-order optimization methods with inertia and Hessian driven damping
- Optimal decay rates for semi-linear non-autonomous evolution equations with vanishing damping
- Optimal convergence rates for damped inertial gradient dynamics with flat geometries
- FISTA is an automatic geometrically optimized algorithm for strongly convex functions
- Understanding the acceleration phenomenon via high-resolution differential equations
- Factor-\(\sqrt{2}\) acceleration of accelerated gradient methods
This page was built for publication: Optimal convergence rates for Nesterov acceleration