SGEM: stochastic gradient with energy and momentum


DOI: 10.1007/s11075-023-01621-x
arXiv: 2208.02208
MaRDI QID: Q6202786


Authors: Hailiang Liu, Xuping Tian


Publication date: 26 March 2024

Published in: Numerical Algorithms

Abstract: In this paper, we propose SGEM, Stochastic Gradient with Energy and Momentum, for solving a large class of general nonconvex stochastic optimization problems, based on the AEGD method that originated in the work [AEGD: Adaptive Gradient Descent with Energy. arXiv: 2010.05109]. SGEM incorporates energy and momentum simultaneously, inheriting the advantages of both. We show that SGEM features an unconditional energy stability property, and we derive energy-dependent convergence rates in the general nonconvex stochastic setting, as well as a regret bound in the online convex setting. A lower threshold for the energy variable is also provided. Our experimental results show that SGEM converges faster than AEGD and generalizes better than, or at least as well as, SGDM in training some deep neural networks.


Full work available at URL: https://arxiv.org/abs/2208.02208
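
The abstract names the two ingredients, an energy variable and momentum, but not the update rule itself. The Python sketch below is only a plausible reconstruction for illustration: it combines the AEGD-style elementwise energy update (from arXiv: 2010.05109, which the abstract cites as SGEM's starting point) with a standard exponential momentum average. The step size eta, momentum parameter beta, and energy shift c are assumed notation; the paper's actual SGEM scheme should be taken from the source above.

```python
import numpy as np

def sgem_step(theta, m, r, grad, loss, eta=0.1, beta=0.9, c=1.0):
    # Hypothetical sketch: AEGD-style energy update combined with momentum;
    # not necessarily the paper's exact SGEM scheme.
    # Momentum average of the stochastic gradients.
    m = beta * m + (1.0 - beta) * grad
    # Energy-scaled search direction; requires loss + c > 0.
    v = m / (2.0 * np.sqrt(loss + c))
    # Elementwise energy update: r can only shrink, for any eta > 0,
    # which illustrates the "unconditional energy stability" the
    # abstract refers to.
    r = r / (1.0 + 2.0 * eta * v ** 2)
    # Parameter update scaled by the energy variable.
    theta = theta - 2.0 * eta * r * v
    return theta, m, r

# Toy usage: minimize f(x) = ||x||^2, with r initialized as in AEGD,
# i.e. r_0 = sqrt(f(x_0) + c).
theta = np.array([3.0, -2.0])
m = np.zeros_like(theta)
r = np.full_like(theta, np.sqrt(float(np.sum(theta ** 2)) + 1.0))
for _ in range(500):
    loss = float(np.sum(theta ** 2))
    theta, m, r = sgem_step(theta, m, r, grad=2.0 * theta, loss=loss)
print(theta)  # approaches the minimizer at the origin
```

In this sketch, the effective step size is eta * r / sqrt(loss + c), so the monotonically decreasing energy r plays the role of an automatic, data-driven step-size decay, which is consistent with the energy-dependent convergence rates mentioned in the abstract.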









