Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks
DOI: 10.1109/TIT.2018.2854560 · zbMATH Open: 1428.68255 · arXiv: 1707.04926 · OpenAlex: W2963417959 · Wikidata: Q129563058 · Scholia: Q129563058 · MaRDI QID: Q4615339 · FDO: Q4615339
Jason D. Lee, A. Javanmard, Mahdi Soltanolkotabi
Publication date: 28 January 2019
Published in: IEEE Transactions on Information Theory
Full work available at URL: https://arxiv.org/abs/1707.04926
MSC classification: Learning and adaptive systems in artificial intelligence (68T05); Artificial neural networks and deep learning (68T07); Nonconvex programming, global optimization (90C26)
Cited In (36)
- Suboptimal Local Minima Exist for Wide Neural Networks with Smooth Activations
- Non-differentiable saddle points and sub-optimal local minima exist for deep ReLU networks
- Stable recovery of entangled weights: towards robust identification of deep neural networks from minimal samples
- The interpolation phase transition in neural networks: memorization and generalization under lazy training
- On the Benefit of Width for Neural Networks: Disappearance of Basins
- Spurious Valleys in Two-layer Neural Network Optimization Landscapes
- Gradient descent with random initialization: fast global convergence for nonconvex phase retrieval
- Analysis of a two-layer neural network via displacement convexity
- First-order methods almost always avoid strict saddle points
- Solving phase retrieval with random initial guess is nearly as good as by spectral initialization
- Implicit regularization in nonconvex statistical estimation: gradient descent converges linearly for phase retrieval, matrix completion, and blind deconvolution
- Uncertainty quantification of graph convolution neural network models of evolving processes
- Non-convergence of stochastic gradient descent in the training of deep neural networks
- Title not available
- Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
- On the Landscape of Synchronization Networks: A Perspective from Nonconvex Optimization
- Extending the Step-Size Restriction for Gradient Descent to Avoid Strict Saddle Points
- Neural ODEs as the deep limit of ResNets with constant weights
- On PDE Characterization of Smooth Hierarchical Functions Computed by Neural Networks
- Utility/privacy trade-off as regularized optimal transport
- Gradient descent provably escapes saddle points in the training of shallow ReLU networks
- Align, then memorise: the dynamics of learning with feedback alignment*
- Exploiting layerwise convexity of rectifier networks with sign constrained weights
- Title not available
- Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity
- Align, then memorise: the dynamics of learning with feedback alignment*
- Title not available
- Simultaneous neural network approximation for smooth functions
- Recent Theoretical Advances in Non-Convex Optimization
- Symmetry & critical points for a model shallow neural network
- Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
- Title not available
- Principal Component Analysis by Optimization of Symmetric Functions has no Spurious Local Optima
- Applied harmonic analysis and data processing. Abstracts from the workshop held March 25–31, 2018
- The curse of overparametrization in adversarial training: precise analysis of robust generalization for random features regression
- Optimization for deep learning: an overview