Pages that link to "Item:Q2183586"
The following pages link to Gradient descent optimizes over-parameterized deep ReLU networks (Q2183586):
Displaying 44 items.
- Non-convergence of stochastic gradient descent in the training of deep neural networks (Q2034567)
- Linearized two-layers neural networks in high dimension (Q2039801)
- Gradient convergence of deep learning-based numerical methods for BSDEs (Q2044106)
- Normalization effects on shallow neural networks and related asymptotic expansions (Q2072629)
- Stabilize deep ResNet with a sharp scaling factor \(\tau\) (Q2102389)
- The interpolation phase transition in neural networks: memorization and generalization under lazy training (Q2105197)
- Provably training overparameterized neural network classifiers with non-convex constraints (Q2106783)
- Surprises in high-dimensional ridgeless least squares interpolation (Q2131262)
- Generalization error of random feature and kernel methods: hypercontractivity and kernel matrix concentration (Q2134105)
- Loss landscapes and optimization in over-parameterized non-linear systems and neural networks (Q2134108)
- A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions (Q2145074)
- A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions (Q2167333)
- Gradient descent optimizes over-parameterized deep ReLU networks (Q2183586)
- A comparative analysis of optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics (Q2197845)
- Optimization for deep learning: an overview (Q2218095)
- Towards interpreting deep neural networks via layer behavior understanding (Q2673336)
- Growing axons: greedy learning of neural networks with application to function approximation (Q2689211)
- Greedy training algorithms for neural networks and applications to PDEs (Q2699382)
- Memory Capacity of Neural Networks with Threshold and Rectified Linear Unit Activations (Q5037553)
- Effects of depth, width, and initialization: A convergence analysis of layer-wise training for deep linear neural networks (Q5037872)
- Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity (Q5051381)
- (Q5054655)
- Particle dual averaging: optimization of mean field neural network with global convergence rate analysis* (Q5055425)
- Benign overfitting in linear regression (Q5073215)
- Full error analysis for the training of deep neural networks (Q5083408)
- On the Benefit of Width for Neural Networks: Disappearance of Basins (Q5097010)
- Plateau Phenomenon in Gradient Descent Training of RELU Networks: Explanation, Quantification, and Avoidance (Q5157837)
- (Q5159434)
- Every Local Minimum Value Is the Global Minimum Value of Induced Model in Nonconvex Machine Learning (Q5214402)
- On the Effect of the Activation Function on the Distribution of Hidden Nodes in a Deep Network (Q5214413)
- Wide neural networks of any depth evolve as linear models under gradient descent* (Q5857449)
- Dynamics of stochastic gradient descent for two-layer neural networks in the teacher–student setup* (Q5857458)
- Suboptimal Local Minima Exist for Wide Neural Networks with Smooth Activations (Q5870356)
- Deep learning: a statistical viewpoint (Q5887827)
- Deep learning in random neural fields: numerical experiments via neural tangent kernel (Q6053432)
- Non-differentiable saddle points and sub-optimal local minima exist for deep ReLU networks (Q6055135)
- Black holes and the loss landscape in machine learning (Q6061784)
- A rigorous framework for the mean field limit of multilayer neural networks (Q6062704)
- Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation (Q6107984)
- Convergence rates for shallow neural networks learned by gradient descent (Q6137712)
- On stochastic roundoff errors in gradient descent with low-precision computation (Q6150643)
- FedHD: communication-efficient federated learning from hybrid data (Q6177550)
- Normalization effects on deep neural networks (Q6194477)
- Value iteration for streaming data on a continuous space with gradient method in an RKHS (Q6488837)