SGEM: stochastic gradient with energy and momentum

DOI: 10.1007/S11075-023-01621-X; arXiv: 2208.02208; MaRDI QID: Q6202786; FDO: Q6202786

Publication date: 26 March 2024

Published in: Numerical Algorithms

Abstract: In this paper, we propose SGEM, Stochastic Gradient with Energy and Momentum, to solve a large class of general nonconvex stochastic optimization problems, based on the AEGD method that originated in the work [AEGD: Adaptive Gradient Descent with Energy. arXiv: 2010.05109]. SGEM incorporates both energy and momentum so as to inherit the advantages of each. We show that SGEM features an unconditional energy stability property, and derive energy-dependent convergence rates in the general nonconvex stochastic setting, as well as a regret bound in the online convex setting. A lower threshold for the energy variable is also provided. Our experimental results show that SGEM converges faster than AEGD and generalizes better than, or at least as well as, SGDM in training some deep neural networks.
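
Illustrative sketch (not part of the record): the abstract names the two ingredients, energy and momentum, without stating the update rule. The Python below is a minimal sketch of one way the two can be combined. It assumes an AEGD-style element-wise energy variable `r` that is divided by `1 + 2*eta*v**2` at every step, and a bias-corrected exponential moving average of gradients as the momentum. The function name `sgem_sketch`, the helper `quadratic`, the test objective, and all hyperparameter values are hypothetical choices made for illustration; the exact scheme and its analysis are in the paper at the arXiv link.

```python
import math

def sgem_sketch(loss_and_grad, theta0, eta=0.1, beta=0.9, c=1.0, steps=200):
    """Energy-plus-momentum update (illustrative; not the authors' reference code).

    loss_and_grad(theta) must return (loss, grad) with loss + c > 0.
    """
    theta = list(theta0)
    n = len(theta)
    m = [0.0] * n                      # momentum: moving average of gradients
    loss0, _ = loss_and_grad(theta)
    r = [math.sqrt(loss0 + c)] * n     # assumed energy initialisation sqrt(f(theta_0) + c)
    energy_history = [list(r)]
    for t in range(1, steps + 1):
        loss, g = loss_and_grad(theta)
        # Bias correction (1 - beta**t) and the factor 2*sqrt(f + c) are assumptions.
        scale = 2.0 * (1.0 - beta ** t) * math.sqrt(loss + c)
        for i in range(n):
            m[i] = beta * m[i] + (1.0 - beta) * g[i]
            v = m[i] / scale
            # The denominator is >= 1, so each r[i] can never increase.
            r[i] = r[i] / (1.0 + 2.0 * eta * v * v)
            theta[i] -= 2.0 * eta * r[i] * v
        energy_history.append(list(r))
    return theta, energy_history

def quadratic(theta):
    """Hypothetical test objective f(x) = 0.5 * ||x||^2 with gradient x."""
    return 0.5 * sum(x * x for x in theta), list(theta)

theta_start = [3.0, -2.0]
theta_final, energies = sgem_sketch(quadratic, theta_start)
initial_loss = quadratic(theta_start)[0]
final_loss = quadratic(theta_final)[0]
```

In this sketch the energy is non-increasing for any step size `eta > 0`, because each update divides `r[i]` by a number that is at least 1. This is the kind of behaviour the abstract calls unconditional energy stability, though the paper's precise statement may differ.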

Full work available at URL: https://arxiv.org/abs/2208.02208

zbMATH Keywords

stochastic optimization; energy stability; momentum; gradient descent

Mathematics Subject Classification ID

Numerical optimization and variational techniques (65K10); Stochastic programming (90C15)

Cited In (1)

Anderson acceleration of gradient methods with energy for optimization problems
