Accelerating variance-reduced stochastic gradient methods
Publication:2118092
Abstract: Variance reduction is a crucial tool for improving the slow convergence of stochastic gradient descent. Only a few variance-reduced methods, however, have yet been shown to directly benefit from Nesterov's acceleration techniques to match the convergence rates of accelerated gradient methods. Such approaches rely on "negative momentum", a technique for further variance reduction that is generally specific to the SVRG gradient estimator. In this work, we show that negative momentum is unnecessary for acceleration and develop a universal acceleration framework that allows all popular variance-reduced methods to achieve accelerated convergence rates. The constants appearing in these rates, including their dependence on the number of functions \(n\), scale with the mean-squared error and bias of the gradient estimator. In a series of numerical experiments, we demonstrate that versions of SAGA, SVRG, SARAH, and SARGE using our framework significantly outperform non-accelerated versions and compare favourably with algorithms using negative momentum.
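For readers unfamiliar with the variance-reduced estimators the abstract refers to, the sketch below illustrates a plain (non-accelerated) SVRG loop on a least-squares finite sum. It is a minimal illustration only, not the accelerated framework proposed in the paper; the function name svrg_least_squares and all parameter choices (step size, epoch length, problem sizes) are assumptions made for this example.

```python
# Minimal SVRG sketch on a least-squares finite sum (illustrative only; not the
# accelerated framework of the paper). Objective: (1/2n) * ||A x - b||^2.
import numpy as np

def svrg_least_squares(A, b, step=0.05, epochs=20, inner_steps=None, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    if inner_steps is None:
        inner_steps = 2 * n
    x_snapshot = np.zeros(d)
    for _ in range(epochs):
        # Full gradient at the snapshot point (the expensive, once-per-epoch step).
        full_grad = A.T @ (A @ x_snapshot - b) / n
        x = x_snapshot.copy()
        for _ in range(inner_steps):
            i = rng.integers(n)
            a_i = A[i]
            # SVRG estimator: grad_i(x) - grad_i(x_snapshot) + full_grad.
            # It is unbiased for the full gradient at x, and its variance shrinks
            # as x and x_snapshot approach the minimizer.
            g = a_i * (a_i @ x - b[i]) - a_i * (a_i @ x_snapshot - b[i]) + full_grad
            x -= step * g
        x_snapshot = x
    return x_snapshot

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 10))
    x_true = rng.standard_normal(10)
    b = A @ x_true
    x_hat = svrg_least_squares(A, b)
    print("distance to solution:", np.linalg.norm(x_hat - x_true))
```

The decreasing variance of this estimator is the property the abstract points to: the accelerated rates obtained in the paper have constants that scale with the mean-squared error and bias of whichever gradient estimator (SAGA, SVRG, SARAH, SARGE) is plugged into the framework.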
Recommendations
- Accelerated stochastic variance reduction for a class of convex optimization problems
- Katyusha: the first direct acceleration of stochastic gradient methods
- Katyusha: the first direct acceleration of stochastic gradient methods
- Asymptotic estimates for \(r\)-Whitney numbers of the second kind
- Stochastic quasi-gradient methods: variance reduction via Jacobian sketching
Cites work
- scientific article; zbMATH DE number 845714
- scientific article; zbMATH DE number 7255141
- A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
- A Stochastic Approximation Method
- A proximal stochastic gradient method with progressive variance reduction
- Accelerated gradient methods for nonconvex nonlinear and stochastic programming
- An optimal randomized incremental gradient method
- Catalyst acceleration for first-order convex optimization: from theory to practice
- Ergodic convergence to a zero of the sum of monotone operators in Hilbert space
- Exact matrix completion via convex optimization
- Interior Gradient and Proximal Methods for Convex and Conic Optimization
- Katyusha: the first direct acceleration of stochastic gradient methods
- Linear coupling: an ultimate unification of gradient and mirror descent
- Minimizing finite sums with the stochastic average gradient
- Optimization methods for large-scale machine learning
- Robust principal component analysis?
- Signal Recovery by Proximal Forward-Backward Splitting
- SpiderBoost
- Stochastic primal-dual coordinate method for regularized empirical risk minimization
Cited in (14)
- Accelerated doubly stochastic gradient descent for tensor CP decomposition
- A mini-batch stochastic conjugate gradient algorithm with variance reduction
- An aggressive reduction on the complexity of optimization for non-strongly convex objectives
- Accelerated stochastic variance reduction for a class of convex optimization problems
- An improvement of stochastic gradient descent approach for mean-variance portfolio optimization problem
- Stochastic quasi-gradient methods: variance reduction via Jacobian sketching
- Accelerated gradient methods with absolute and relative noise in the gradient
- Nonconvex optimization with inertial proximal stochastic variance reduction gradient
- Cocoercivity, smoothness and bias in variance-reduced stochastic gradient methods
- Batching Adaptive Variance Reduction
- Katyusha: the first direct acceleration of stochastic gradient methods
- Variance reduction for root-finding problems
- Katyusha: the first direct acceleration of stochastic gradient methods
- Analysis and improvement for a class of variance reduced methods