Survey of unstable gradients in deep neural network training
From MaRDI portal
Publication: Q4624686
Recommendations
- Optimization for deep learning: an overview
- Non-convergence of stochastic gradient descent in the training of deep neural networks
- Gradient explosion free algorithm for training recurrent neural networks
- Why does large batch training result in poor generalization? A comprehensive explanation and a better strategy from the viewpoint of stochastic optimization
- Research progress on batch normalization of deep learning and its related algorithms
Cited in (4)