Cocoercivity, smoothness and bias in variance-reduced stochastic gradient methods
Publication: 2674579
DOI: 10.1007/S11075-022-01280-4
zbMATH Open: 1500.65030
arXiv: 1903.09009
OpenAlex: W4223913444
MaRDI QID: Q2674579
FDO: Q2674579
Authors: Martin Morin, Pontus Giselsson
Publication date: 14 September 2022
Published in: Numerical Algorithms
Abstract: With the purpose of examining biased updates in variance-reduced stochastic gradient methods, we introduce SVAG, a SAG/SAGA-like method with adjustable bias. SVAG is analyzed in a cocoercive root-finding setting, a setting that, for the ordinary proximal-gradient method, yields the same results as the usual smooth convex optimization setting. We show that the same is not true for SVAG when biased updates are used. The step-size requirements when the operators are gradients are significantly less restrictive than when they are not. This highlights the need not to rely solely on cocoercivity when analyzing variance-reduced methods meant for optimization. Our analysis either matches or improves on previously known convergence conditions for SAG and SAGA. However, in the biased cases these conditions still do not correspond well with practical experience, and we therefore examine the effect of bias numerically on a set of classification problems. The choice of bias seems to primarily affect the early stages of convergence, and in most cases the differences vanish in the later stages. However, in a couple of cases the effect of the bias choice remains significant.
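As background for the cocoercive root-finding setting mentioned in the abstract, the following is the standard textbook definition of cocoercivity; it is general background and not a statement taken from this page.

```latex
% An operator $T$ is $\beta$-cocoercive if
\[
  \langle Tx - Ty,\; x - y \rangle \;\ge\; \beta \,\lVert Tx - Ty \rVert^{2}
  \qquad \text{for all } x, y .
\]
% By the Baillon--Haddad theorem, the gradient of an $L$-smooth convex
% function is $(1/L)$-cocoercive, which is why a cocoercive root-finding
% analysis covers smooth convex optimization for the proximal-gradient method.
```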
Full work available at URL: https://arxiv.org/abs/1903.09009
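To make the notion of "adjustable bias" concrete, below is a minimal sketch of a SAG/SAGA-style iteration with an innovation weight theta, written under the common parametrization in which theta = n gives an unbiased SAGA-type update and theta = 1 a biased SAG-type update. This is an illustrative assumption, not necessarily the paper's exact formulation, and all names (svag_sketch, grad_i, parameters) are hypothetical.

```python
import numpy as np

def svag_sketch(grad_i, n, x0, step_size, theta, num_iters, seed=0):
    """SAG/SAGA-like iteration with an adjustable bias weight theta.

    grad_i(i, x): gradient of the i-th component function at x.
    theta = n gives an unbiased (SAGA-style) innovation term,
    theta = 1 a biased (SAG-style) one; intermediate values interpolate.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    table = np.stack([grad_i(i, x) for i in range(n)])   # stored gradients y_i
    table_mean = table.mean(axis=0)
    for _ in range(num_iters):
        i = rng.integers(n)                               # uniform sampling
        g_new = grad_i(i, x)
        # the innovation weight theta controls the bias of the search direction
        direction = (theta / n) * (g_new - table[i]) + table_mean
        x -= step_size * direction
        table_mean += (g_new - table[i]) / n              # maintain running mean
        table[i] = g_new
    return x

# Illustrative use on a tiny least-squares problem:
# A, b = np.random.randn(20, 5), np.random.randn(20)
# x_hat = svag_sketch(lambda i, x: A[i] * (A[i] @ x - b[i]), 20,
#                     np.zeros(5), step_size=0.05, theta=20, num_iters=2000)
```

The stored gradient table and its running mean are the usual SAG/SAGA bookkeeping; only the weight on the fresh-gradient correction changes with theta.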
Recommendations
- Accelerating variance-reduced stochastic gradient methods
- Title not available (zbMATH DE number 7267112)
- Analysis of biased stochastic gradient descent using sequential semidefinite programs
- Analysis and improvement for a class of variance reduced methods
- An analysis of stochastic variance reduced gradient for linear inverse problems
Mathematics Subject Classification: Numerical mathematical programming methods (65K05); Convex programming (90C25); Stochastic programming (90C15)
Cites Work
- Julia: a fresh approach to numerical computing
- Splitting Algorithms for the Sum of Two Nonlinear Operators
- Introductory lectures on convex optimization. A basic course.
- Solving monotone inclusions via compositions of nonexpansive averaged operators
- Monotone Operators and the Proximal Point Algorithm
- Convex programming in Hilbert space
- Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators
- A Modified Forward-Backward Splitting Method for Maximal Monotone Mappings
- Title not available
- Stochastic quasi-Fejér block-coordinate fixed point iterations with random sweeping
- Title not available
- A proximal stochastic gradient method with progressive variance reduction
- Stochastic dual coordinate ascent methods for regularized loss minimization
- Convex analysis and monotone operator theory in Hilbert spaces
- A generalized forward-backward splitting
- Semi-stochastic coordinate descent
- A three-operator splitting scheme and its optimization applications
- Primal-dual proximal algorithms for structured convex optimization: a unifying framework
- Minimizing finite sums with the stochastic average gradient
- Asynchronous block-iterative primal-dual decomposition methods for monotone inclusions
- Forward-Backward-Half Forward Algorithm for Solving Monotone Inclusions
- Katyusha: the first direct acceleration of stochastic gradient methods
- Stochastic quasi-gradient methods: variance reduction via Jacobian sketching
- Solving composite fixed point problems with block updates
- Nonlinear forward-backward splitting with projection correction
Cited In (1)
Uses Software