Adaptive methods using element-wise \(p\)th power of stochastic gradient for nonconvex optimization in deep neural networks
Publication:5084579
zbMATH Open: 1492.90110 · MaRDI QID: Q5084579 · FDO: Q5084579
Authors: Kanako Shimoyama, Hideaki Iiduka
Publication date: 27 June 2022
Full work available at URL: http://www.ybook.co.jp/online2/oplna/vol7/p317.html
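The method itself is not restated in this record; the full text is available at the URL above. As rough orientation only, the sketch below shows one plausible reading of an "element-wise \(p\)th power" adaptive update, assuming it generalizes Adam-style methods by accumulating \(|g|^p\) instead of the squared gradient and scaling the step by the accumulator's \(p\)th root. The function name, hyperparameters, and their default values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pth_power_adaptive_step(theta, grad, m, v, t, p=3.0,
                            beta1=0.9, beta2=0.999, lr=1e-3, eps=1e-8):
    """Illustrative Adam-style step in which the second accumulator tracks
    |grad|**p element-wise and the step is scaled by its pth root.
    All names and hyperparameter values are placeholders, not the paper's."""
    m = beta1 * m + (1.0 - beta1) * grad               # first moment (momentum)
    v = beta2 * v + (1.0 - beta2) * np.abs(grad) ** p  # element-wise pth power
    m_hat = m / (1.0 - beta1 ** t)                     # bias corrections
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (v_hat ** (1.0 / p) + eps)
    return theta, m, v

# Toy usage: noisy gradients of f(theta) = ||theta||^2.
theta = np.array([1.0, -2.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 101):
    grad = 2.0 * theta + 0.1 * np.random.randn(2)
    theta, m, v = pth_power_adaptive_step(theta, grad, m, v, t)
```

Setting \(p = 2\) in this sketch recovers the familiar Adam-type update; other values of \(p\) change how strongly large gradient components are damped.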
Recommendations
- Conjugate-gradient-based Adam for nonconvex stochastic optimization and its application to deep learning
- A heuristic adaptive fast gradient method in stochastic optimization problems
- An adaptive gradient method with energy and momentum
- Convergence of the RMSProp deep learning method with penalty for nonconvex optimization
- Adaptive sampling for incremental optimization using stochastic gradient descent
Mathematics Subject Classification
- Numerical mathematical programming methods (65K05)
- Nonconvex programming, global optimization (90C26)
- Stochastic programming (90C15)
- Neural networks for/in biological studies, artificial life and related topics (92B20)
Cited In (4)
- Automatic, dynamic, and nearly optimal learning rate specification via local quadratic approximation
- Convergence of the RMSProp deep learning method with penalty for nonconvex optimization
- Taming Neural Networks with TUSLA: Nonconvex Learning via Adaptive Stochastic Gradient Langevin Algorithms
- Conjugate-gradient-based Adam for nonconvex stochastic optimization and its application to deep learning