A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
DOI: 10.1016/j.jco.2022.101646
zbMath: 1502.65037
arXiv: 2102.09924
OpenAlex: W3132264265
Wikidata: Q113871711 (Scholia: Q113871711)
MaRDI QID: Q2145074
Patrick Cheridito, Adrian Riekert, Florian Rossmannek, Arnulf Jentzen
Publication date: 17 June 2022
Published in: Journal of Complexity
Full work available at URL: https://arxiv.org/abs/2102.09924
Keywords: nonsmooth optimization, nonconvex optimization, gradient methods, artificial neural networks, machine learning
MSC classes: Artificial neural networks and deep learning (68T07); Numerical optimization and variational techniques (65K10); Approximation by other special function classes (41A30)
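For orientation, the following is a minimal, self-contained sketch of the setting named in the title: plain gradient descent applied to the mean-squared error of a one-hidden-layer ReLU network fitted to a constant target function. The architecture, learning rate, and all variable names below are assumptions chosen for illustration only; this is not the algorithm analysed in the paper, nor its proof.

# Illustrative sketch only: gradient descent on the mean-squared error of a
# one-hidden-layer ReLU network with a constant target function. All
# hyperparameters (width, learning rate, target value) are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Training inputs on [0, 1] and a constant target value (assumed c = 1.0).
x = rng.uniform(0.0, 1.0, size=(128, 1))
c = 1.0
y = np.full((128, 1), c)

# One hidden layer with ReLU activation.
n_hidden = 16
W1 = rng.normal(0.0, 1.0, size=(1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 1.0, size=(n_hidden, 1))
b2 = np.zeros(1)

lr = 0.05  # assumed constant learning rate
for step in range(2000):
    # Forward pass.
    z = x @ W1 + b1              # pre-activations, shape (128, n_hidden)
    a = np.maximum(z, 0.0)       # ReLU
    out = a @ W2 + b2            # network output, shape (128, 1)

    # Mean-squared error against the constant target.
    err = out - y
    loss = np.mean(err ** 2)

    # Backward pass: gradients of the mean-squared error.
    grad_out = 2.0 * err / len(x)        # (128, 1)
    grad_W2 = a.T @ grad_out             # (n_hidden, 1)
    grad_b2 = grad_out.sum(axis=0)       # (1,)
    grad_a = grad_out @ W2.T             # (128, n_hidden)
    grad_z = grad_a * (z > 0.0)          # ReLU subgradient
    grad_W1 = x.T @ grad_z               # (1, n_hidden)
    grad_b1 = grad_z.sum(axis=0)         # (n_hidden,)

    # Gradient-descent update with constant step size.
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

    if step % 500 == 0:
        print(f"step {step:4d}  loss {loss:.6f}")

Running the sketch prints the training loss every 500 steps, which lets one observe its behaviour on this toy constant-target problem.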
Related Items (4)
Cites Work
- Non-convergence of stochastic gradient descent in the training of deep neural networks
- Gradient descent optimizes over-parameterized deep ReLU networks
- A comparative analysis of optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics
- Lower error bounds for the stochastic gradient descent optimization algorithm: sharp convergence rates for slowly and fast decaying learning rates
- Strong error analysis for stochastic gradient descent optimization algorithms
- Full error analysis for the training of deep neural networks
- Dying ReLU and Initialization: Theory and Numerical Examples
- Breaking the Curse of Dimensionality with Convex Neural Networks