Stopping criteria for, and strong convergence of, stochastic gradient descent on Bottou-Curtis-Nocedal functions
Abstract: Stopping criteria for Stochastic Gradient Descent (SGD) methods play important roles, from enabling adaptive step size schemes to providing rigor for downstream analyses such as asymptotic inference. Unfortunately, current stopping criteria for SGD methods are often heuristics that rely on asymptotic normality results or convergence to stationary distributions, which may fail to exist for nonconvex functions and thereby limit the applicability of such stopping criteria. To address this issue, we rigorously develop two stopping criteria for SGD that apply to a broad class of nonconvex functions, which we term Bottou-Curtis-Nocedal functions. Moreover, as a prerequisite for developing these stopping criteria, we prove that the gradient function evaluated at SGD's iterates converges strongly to zero for Bottou-Curtis-Nocedal functions, which resolves an open question in the SGD literature. As a result, our rigorously developed stopping criteria can be used to design new adaptive step size schemes or to bolster other downstream analyses for nonconvex functions.
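This page reproduces only the abstract, not the stopping criteria themselves. As a rough illustration of the kind of rule under discussion, the sketch below implements a generic moving-average gradient-norm stopping test for SGD on a noisy nonconvex objective. All names and choices in it (`sgd_with_stopping`, `grad_sample`, the averaging window, the test objective) are hypothetical illustrations and not the authors' construction.

```python
from collections import deque
import numpy as np

def sgd_with_stopping(grad_sample, x0, step=1e-2, tol=1e-3,
                      window=200, max_iter=100_000, rng=None):
    """Run SGD and stop once a moving average of recent stochastic
    gradients (a crude surrogate for the true gradient) has small norm.

    grad_sample(x, rng) must return one unbiased stochastic gradient
    estimate at the point x. This is a generic heuristic sketch, not
    the criteria developed in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    recent = deque(maxlen=window)  # last `window` stochastic gradients
    for k in range(max_iter):
        g = grad_sample(x, rng)
        x = x - step * g
        recent.append(g)
        # Heuristic stopping test: averaging damps the gradient noise,
        # so a small averaged norm suggests a near-stationary point.
        if len(recent) == window and np.linalg.norm(np.mean(recent, axis=0)) < tol:
            return x, k + 1
    return x, max_iter

# Illustrative nonconvex objective f(x) = sum_i (x_i^2 + sin(x_i)),
# whose exact gradient 2x + cos(x) is corrupted with Gaussian noise.
def grad_sample(x, rng):
    return 2.0 * x + np.cos(x) + 0.1 * rng.standard_normal(x.shape)

x_final, iters = sgd_with_stopping(grad_sample, x0=np.ones(5))
print(f"stopped after {iters} iterations at x = {x_final}")
```

Note that such a moving-average test is exactly the kind of heuristic the abstract cautions about: without results like the strong convergence theorem proved in the paper, there is no guarantee the test fires near a stationary point for general nonconvex functions.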
Recommendations
- Convergence rates for the stochastic gradient descent method for non-convex objective functions
- Convergence of constant step stochastic gradient descent for non-smooth non-convex functions
- Stopping rules for gradient methods for non-convex problems with additive noise in gradient
- Stochastic gradient descent with Polyak's learning rate
- Stochastic subgradient method converges on tame functions
Cites work
- scientific article; zbMATH DE number 3887444
- scientific article; zbMATH DE number 4085419
- scientific article; zbMATH DE number 3526471
- scientific article; zbMATH DE number 1246230
- scientific article; zbMATH DE number 893887
- A Stochastic Approximation Method
- A stopping rule for the Robbins-Monro method
- Asymptotic Statistics
- Bounded Length Confidence Intervals for the Zero of a Regression Function
- Kalman-based stochastic gradient method with stop condition and insensitivity to conditioning
- Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization
- Mixed effects models for complex data
- On a Stochastic Approximation Method
- On a new stopping rule for stochastic approximation
- Optimization methods for large-scale machine learning
- Probability: Theory and Examples
- Robust Stochastic Approximation Approach to Stochastic Programming
- Stochastic Approximation of Minima with Improved Asymptotic Speed
- Stochastic Estimation of the Maximum of a Regression Function
- Stochastic proximal quasi-Newton methods for non-convex composite optimization
- Stopping times for stochastic approximation procedures
- Stochastic quasigradient methods and their application to system optimization
Cited in (5)
- Gradient estimation for smooth stopping criteria
- Gradient descent in the absence of global Lipschitz continuity of the gradients
- Classical and fast parameters tuning in nearest neighbors with stop condition
- Stopping rules for gradient methods for non-convex problems with additive noise in gradient
- Robust optimization of control parameters for WEC arrays using stochastic methods