SGEM: stochastic gradient with energy and momentum
From MaRDI portal
Publication: Q6202786
DOI: 10.1007/s11075-023-01621-x
arXiv: 2208.02208
MaRDI QID: Q6202786
Authors: Hailiang Liu, Xuping Tian
Publication date: 26 March 2024
Published in: Numerical Algorithms
Abstract: In this paper, we propose SGEM, Stochastic Gradient with Energy and Momentum, to solve a large class of general non-convex stochastic optimization problems, building on the AEGD method introduced in [AEGD: Adaptive Gradient Descent with Energy, arXiv:2010.05109]. SGEM incorporates energy and momentum simultaneously, so as to inherit the advantages of both. We show that SGEM features an unconditional energy stability property, and derive energy-dependent convergence rates in the general non-convex stochastic setting, as well as a regret bound in the online convex setting. A lower threshold for the energy variable is also provided. Our experimental results show that SGEM converges faster than AEGD and generalizes better than, or at least as well as, SGDM when training some deep neural networks.
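The abstract describes SGEM as combining AEGD's energy variable with gradient momentum. Below is a minimal, hedged sketch of what such a step could look like on a toy quadratic: heavy-ball momentum is applied to the stochastic gradient, and an AEGD-style energy variable monotonically decays and scales the update (which is the source of the unconditional energy stability). The function names, hyperparameter values, and the exact update order here are illustrative assumptions; the precise scheme is given in the paper (arXiv:2208.02208) and in arXiv:2010.05109.

```python
import numpy as np

def f(theta):
    # Toy objective f(theta) = 0.5 * ||theta||^2 (illustrative, not from the paper)
    return 0.5 * np.dot(theta, theta)

def grad(theta, rng):
    # Stochastic gradient: true gradient plus Gaussian noise
    return theta + 0.1 * rng.standard_normal(theta.shape)

def sgem_sketch(theta0, eta=0.1, beta=0.9, c=1.0, steps=200, seed=0):
    """Hedged sketch of an SGEM-style loop; details differ from the paper."""
    rng = np.random.default_rng(seed)
    theta = theta0.astype(float)
    m = np.zeros_like(theta)                          # momentum buffer
    r = np.full_like(theta, np.sqrt(f(theta) + c))    # energy variable r_0
    for _ in range(steps):
        g = grad(theta, rng)
        m = beta * m + (1 - beta) * g                 # momentum on the gradient
        v = m / (2.0 * np.sqrt(f(theta) + c))         # energy-gradient direction
        r = r / (1.0 + 2.0 * eta * v * v)             # r decays monotonically
        theta = theta - 2.0 * eta * r * v             # energy-scaled update
    return theta, r

theta, r = sgem_sketch(np.array([3.0, -2.0]))
print(f(theta))  # loss should be far below the initial f(theta0) = 6.5
```

Note how the energy variable `r` is strictly positive and non-increasing by construction, regardless of the step size `eta`; this mirrors the unconditional energy stability property claimed in the abstract, though the sketch does not reproduce the paper's convergence analysis.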
Full work available at URL: https://arxiv.org/abs/2208.02208
Recommendations
- An adaptive gradient method with energy and momentum
- Stochastic gradient Hamiltonian Monte Carlo for non-convex learning
- Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance
- Global convergence of stochastic gradient Hamiltonian Monte Carlo for nonconvex stochastic optimization: nonasymptotic performance bounds and momentum-based acceleration
Cites Work
- Adaptive subgradient methods for online learning and stochastic optimization
- A Stochastic Approximation Method
- Deep learning
- Some methods of speeding up the convergence of iteration methods
- Convergence analysis of gradient descent stochastic algorithms
- Linear, first and second-order, unconditionally energy stable numerical schemes for the phase field model of homopolymer blends
- Optimization methods for large-scale machine learning
- Numerical approximations for a phase field dendritic crystal growth model based on the invariant energy quadratization approach
- Katyusha: the first direct acceleration of stochastic gradient methods
Cited In (1)