Conjugate-gradient-based Adam for nonconvex stochastic optimization and its application to deep learning
Publication: 5052078
zbMATH Open: 1498.65092
MaRDI QID: Q5052078
FDO: Q5052078
Authors: Yu Kobayashi, Hideaki Iiduka
Publication date: 18 November 2022
Full work available at URL: http://yokohamapublishers.jp/online2/opjnca/vol23/p337.html
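The entry names the method but carries no description of it. As a rough orientation only, the sketch below shows the idea the title suggests: feeding a conjugate-gradient-like search direction, rather than the raw stochastic gradient, into Adam's moment estimates. This is a minimal sketch assuming a Fletcher-Reeves-style conjugacy parameter; the function name `coba_step`, the `state` dictionary layout, and the hyperparameter defaults are illustrative assumptions, not the paper's exact algorithm or API.

```python
import numpy as np

def coba_step(params, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of a conjugate-gradient-based Adam variant (illustrative sketch).

    A conjugate-gradient-like direction d_t = g_t + gamma_t * d_{t-1} replaces
    the raw gradient g_t in Adam's first- and second-moment estimates. The
    Fletcher-Reeves formula for gamma_t is one of several classical choices;
    it is an assumption here, not necessarily the paper's choice.
    """
    t = state["t"] = state.get("t", 0) + 1
    d_prev = state.get("d", np.zeros_like(grad))
    g_prev_norm2 = state.get("g_norm2", None)

    g_norm2 = float(np.dot(grad.ravel(), grad.ravel()))
    # Fletcher-Reeves-style conjugacy parameter; zero on the first step.
    gamma = 0.0 if g_prev_norm2 in (None, 0.0) else g_norm2 / g_prev_norm2
    d = grad + gamma * d_prev  # conjugate-gradient-like direction

    m = state.get("m", np.zeros_like(grad))
    v = state.get("v", np.zeros_like(grad))
    m = beta1 * m + (1 - beta1) * d          # first moment of d, as in Adam
    v = beta2 * v + (1 - beta2) * d * d      # second moment of d, as in Adam
    m_hat = m / (1 - beta1 ** t)             # Adam's bias corrections
    v_hat = v / (1 - beta2 ** t)

    state.update(d=d, g_norm2=g_norm2, m=m, v=v)
    return params - lr * m_hat / (np.sqrt(v_hat) + eps)
```

With `gamma = 0` on every step this reduces to plain Adam, so repeated calls of the form `params = coba_step(params, grad_fn(params), state)` differ from Adam only through the conjugacy term carried in `state`.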
Recommendations
- Convergence and dynamical behavior of the ADAM algorithm for nonconvex stochastic optimization
- Adaptive methods using element-wise \(p\)th power of stochastic gradient for nonconvex optimization in deep neural networks
- Convergence of the RMSProp deep learning method with penalty for nonconvex optimization
- A Diffusion Approximation Theory of Momentum Stochastic Gradient Descent in Nonconvex Optimization
Mathematics Subject Classification
- Numerical mathematical programming methods (65K05)
- Applications of mathematical programming (90C90)
- Nonconvex programming, global optimization (90C26)
- Stochastic programming (90C15)
Cited In (11)
- Theoretical analysis of Adam using hyperparameters close to one without Lipschitz smoothness
- Stochastic three-term conjugate gradient method with variance technique for non-convex learning
- Convergence analysis of AdaBound with relaxed bound functions for non-convex optimization
- Convergence of the RMSProp deep learning method with penalty for nonconvex optimization
- Adaptive methods using element-wise \(p\)th power of stochastic gradient for nonconvex optimization in deep neural networks
- Taming Neural Networks with TUSLA: Nonconvex Learning via Adaptive Stochastic Gradient Langevin Algorithms
- Convergence and dynamical behavior of the ADAM algorithm for nonconvex stochastic optimization
- Block-cyclic stochastic coordinate descent for deep neural networks
- Combining stochastic adaptive cubic regularization with negative curvature for nonconvex optimization
- Stochastic generalized gradient methods for training nonconvex nonsmooth neural networks
- A modification of adaptive moment estimation (Adam) for machine learning