Conjugate-gradient-based Adam for nonconvex stochastic optimization and its application to deep learning
Publication: 5052078 (MaRDI item Q5052078)
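
The record's title names a conjugate-gradient-based variant of Adam for nonconvex stochastic optimization. Purely as a hedged illustration of that idea (the paper's exact update is not reproduced on this page), the sketch below drives Adam's first-moment estimate with a conjugate-gradient-style search direction instead of the raw gradient. The function name `coba_like_adam`, the Fletcher-Reeves-style coefficient, and the damping factor `gamma` are assumptions made for this sketch, not the authors' algorithm.

```python
import numpy as np

def coba_like_adam(grad_fn, x0, steps=1000, alpha=1e-3,
                   beta1=0.9, beta2=0.999, eps=1e-8, gamma=1e-4):
    """Illustrative sketch only: an Adam-style update whose first moment
    accumulates a conjugate-gradient-style direction rather than the raw
    gradient. The Fletcher-Reeves-style coefficient and the damping factor
    gamma are assumptions, not the cited paper's exact scheme."""
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)   # first-moment estimate (of the CG direction)
    v = np.zeros_like(x)   # second-moment estimate (of the gradient)
    d = np.zeros_like(x)   # conjugate-gradient-style search direction
    g_prev_sq = None
    for t in range(1, steps + 1):
        g = grad_fn(x)
        g_sq = float(g @ g)
        # Fletcher-Reeves-style coefficient, damped by gamma (assumption)
        beta_fr = 0.0 if g_prev_sq is None else gamma * g_sq / (g_prev_sq + eps)
        d = -g + beta_fr * d               # CG-style direction; -g on the first step
        m = beta1 * m + (1 - beta1) * d    # moment of the direction, not of g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)         # bias corrections as in Adam
        v_hat = v / (1 - beta2**t)
        # m_hat already points downhill, so the step is added, not subtracted
        x = x + alpha * m_hat / (np.sqrt(v_hat) + eps)
        g_prev_sq = g_sq
    return x

# Example: minimize the (nonconvex) Rosenbrock function
rosen_grad = lambda x: np.array([
    -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
    200 * (x[1] - x[0]**2),
])
x_star = coba_like_adam(rosen_grad, x0=[-1.0, 1.0], steps=20000, alpha=1e-2)
print(x_star)  # should approach the minimizer (1, 1)
```
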
Recommendations
- Convergence and dynamical behavior of the ADAM algorithm for nonconvex stochastic optimization
- Adaptive methods using element-wise \(p\)th power of stochastic gradient for nonconvex optimization in deep neural networks
- Convergence of the RMSProp deep learning method with penalty for nonconvex optimization
- Scientific article (zbMATH DE number 1928800)
- A Diffusion Approximation Theory of Momentum Stochastic Gradient Descent in Nonconvex Optimization
Cited in (11)
- Theoretical analysis of Adam using hyperparameters close to one without Lipschitz smoothness
- Stochastic three-term conjugate gradient method with variance technique for non-convex learning
- Convergence analysis of AdaBound with relaxed bound functions for non-convex optimization
- Convergence of the RMSProp deep learning method with penalty for nonconvex optimization
- Adaptive methods using element-wise \(p\)th power of stochastic gradient for nonconvex optimization in deep neural networks
- Taming Neural Networks with TUSLA: Nonconvex Learning via Adaptive Stochastic Gradient Langevin Algorithms
- Convergence and dynamical behavior of the ADAM algorithm for nonconvex stochastic optimization
- Block-cyclic stochastic coordinate descent for deep neural networks
- Combining stochastic adaptive cubic regularization with negative curvature for nonconvex optimization
- Stochastic generalized gradient methods for training nonconvex nonsmooth neural networks
- A modification of adaptive moment estimation (Adam) for machine learning