Accelerating variance-reduced stochastic gradient methods

From MaRDI portal
Publication:2118092

DOI: 10.1007/S10107-020-01566-2
zbMATH Open: 1489.90113
arXiv: 1910.09494
OpenAlex: W3084718985
MaRDI QID: Q2118092
FDO: Q2118092

Matthias J. Ehrhardt, Derek Driggs, Carola-Bibiane Schönlieb

Publication date: 22 March 2022

Published in: Mathematical Programming. Series A. Series B

Abstract: Variance reduction is a crucial tool for improving the slow convergence of stochastic gradient descent. Only a few variance-reduced methods, however, have yet been shown to directly benefit from Nesterov's acceleration techniques to match the convergence rates of accelerated gradient methods. Such approaches rely on "negative momentum", a technique for further variance reduction that is generally specific to the SVRG gradient estimator. In this work, we show that negative momentum is unnecessary for acceleration and develop a universal acceleration framework that allows all popular variance-reduced methods to achieve accelerated convergence rates. The constants appearing in these rates, including their dependence on the number of functions n, scale with the mean-squared-error and bias of the gradient estimator. In a series of numerical experiments, we demonstrate that versions of SAGA, SVRG, SARAH, and SARGE using our framework significantly outperform non-accelerated versions and compare favourably with algorithms using negative momentum.
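The framework described in the abstract is estimator-agnostic, so a small sketch of one such estimator may help fix ideas. The following Python snippet is an illustration, not the paper's method: it implements the SVRG gradient estimator on a synthetic least-squares finite sum and applies a simple Nesterov-style extrapolation between epochs. The problem data, step size, and momentum schedule are assumptions made for this example; the paper instead derives accelerated rates whose constants depend on the mean-squared error and bias of the gradient estimator.

```python
# Minimal sketch, not the paper's algorithm: the SVRG gradient estimator on a
# synthetic least-squares finite sum, with a simple Nesterov-style extrapolation
# between epochs. Problem data, step size, and momentum schedule are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Finite-sum objective: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2
n, d = 200, 20
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

def grad_i(x, i):
    # Gradient of the i-th component 0.5 * (a_i^T x - b_i)^2
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    # Full gradient (1/n) * A^T (A x - b)
    return A.T @ (A @ x - b) / n

def svrg_estimator(x, snapshot, snapshot_grad, i):
    # Unbiased SVRG estimator: grad_i(x) - grad_i(snapshot) + full_grad(snapshot)
    return grad_i(x, i) - grad_i(snapshot, i) + snapshot_grad

L = np.max(np.sum(A ** 2, axis=1))   # per-component smoothness constant
step = 1.0 / (3.0 * L)               # conservative SVRG step size (assumption)

x = np.zeros(d)
y = x.copy()
for k in range(30):                  # outer epochs
    snapshot, snapshot_grad = y.copy(), full_grad(y)
    z = snapshot.copy()
    for _ in range(n):               # inner loop: one pass of stochastic steps
        i = rng.integers(n)
        z = z - step * svrg_estimator(z, snapshot, snapshot_grad, i)
    # Nesterov-style extrapolation between epochs (illustrative k/(k+3) schedule;
    # the paper couples its acceleration parameters to the estimator's MSE and bias).
    y = z + (k / (k + 3)) * (z - x)
    x = z

print("final objective:", 0.5 * np.mean((A @ x - b) ** 2))
```

Swapping `svrg_estimator` for a SAGA-, SARAH-, or SARGE-style estimator changes only the inner update, which is the sense in which the framework is universal across popular variance-reduced estimators.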


Full work available at URL: https://arxiv.org/abs/1910.09494





Cites Work


Cited In (7)

Uses Software






