Gradient descent on infinitely wide neural networks: global convergence and generalization
DOI: 10.4171/ICM2022/121 · arXiv: 2110.08084 · MaRDI QID: Q6200217
Authors: Francis Bach, Lénaïc Chizat
Publication date: 22 March 2024
Published in: International Congress of Mathematicians
Full work available at URL: https://arxiv.org/abs/2110.08084
Recommendations
- Wide neural networks of any depth evolve as linear models under gradient descent
- Infinite-width limit of deep linear neural networks
- A comparative analysis of optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics
- A mean field view of the landscape of two-layer neural networks
- Two-layer neural network on infinite-dimensional data: global optimization guarantee in the mean-field regime
Classifications (MSC)
- Numerical optimization and variational techniques (65K10)
- Artificial neural networks and deep learning (68T07)
- Numerical methods based on nonlinear programming (49M37)
- Methods of reduced gradient type (90C52)
- Probabilistic metric spaces (54E70)
- Probabilistic methods in Banach space theory (46B09)
Cites Work
- Universal approximation bounds for superpositions of a sigmoidal function
- Optimal transport for applied mathematicians. Calculus of variations, PDEs, and modeling
- Title not available
- Asymptotic Statistics
- Gradient flows in metric spaces and in the space of probability measures
- Deep learning
- Empirical margin distributions and bounding the generalization error of combined classifiers
- An Introduction to Numerical Analysis
- Convex optimization: algorithms and complexity
- Probabilistic representation and uniqueness results for measure-valued solutions of transport equations
- Title not available
- Bounds on rates of variable-basis and neural-network approximation
- Lectures on convex optimization
- A mean field view of the landscape of two-layer neural networks
- Breaking the curse of dimensionality with convex neural networks
- Foundations of machine learning
- The implicit bias of gradient descent on separable data
- Mean field analysis of neural networks: a law of large numbers
Cited In (9)
- Infinite-width limit of deep linear neural networks
- A geometric approach of gradient descent algorithms in linear neural networks
- Concentration of measure and global optimization of Bayesian multilayer perceptron. I
- Phase diagram of stochastic gradient descent in high-dimensional two-layer neural networks
- Two-layer neural network on infinite-dimensional data: global optimization guarantee in the mean-field regime
- Convergence of deep convolutional neural networks
- Numerical solution of Poisson partial differential equation in high dimension using two-layer neural networks
- Approximation results for gradient flow trained shallow neural networks in \(1d\)
- Universality of gradient descent neural network training