Adaptive methods using element-wise \(p\)th power of stochastic gradient for nonconvex optimization in deep neural networks
From MaRDI portal
Publication:5084579
Recommendations
- Conjugate-gradient-based Adam for nonconvex stochastic optimization and its application to deep learning
- A heuristic adaptive fast gradient method in stochastic optimization problems
- An adaptive gradient method with energy and momentum
- Convergence of the RMSProp deep learning method with penalty for nonconvex optimization
- Adaptive sampling for incremental optimization using stochastic gradient descent
Cited in (8)
- A control theoretic framework for adaptive gradient optimizers
- Conjugate-gradient-based Adam for nonconvex stochastic optimization and its application to deep learning
- A modification of adaptive moment estimation (Adam) for machine learning
- AdaLo: adaptive learning rate optimizer with loss for classification
- Automatic, dynamic, and nearly optimal learning rate specification via local quadratic approximation
- Convergence of the RMSProp deep learning method with penalty for nonconvex optimization
- Stochastic perturbation of subgradient algorithm for nonconvex deep neural networks
- Taming Neural Networks with TUSLA: Nonconvex Learning via Adaptive Stochastic Gradient Langevin Algorithms
This page was built for publication: Adaptive methods using element-wise \(p\)th power of stochastic gradient for nonconvex optimization in deep neural networks