Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
Publication: 5887828
DOI: 10.1017/S0962492921000039
MaRDI QID: Q5887828
FDO: Q5887828
Publication date: 14 April 2023
Published in: Acta Numerica
Full work available at URL: https://arxiv.org/abs/2105.14368
Cites Work
- (9 cited works with titles not available)
- A decision-theoretic generalization of on-line learning and an application to boosting
- The elements of statistical learning. Data mining, inference, and prediction
- A randomized Kaczmarz algorithm with exponential convergence
- Nearest neighbor pattern classification
- Boosting the margin: a new explanation for the effectiveness of voting methods
- A Correspondence Between Bayesian Estimation on Stochastic Processes and Smoothing by Splines
- On early stopping in gradient descent learning
- A distribution-free theory of nonparametric regression
- DOI 10.1162/153244303321897690 (title not available)
- Advanced Lectures on Machine Learning
- Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
- Occam's razor
- Comment on: Boosting algorithms: regularization, prediction and model fitting
- Convex optimization: algorithms and complexity
- Gauss and the invention of least squares
- Learning Theory
- Scattered Data Approximation
- The Hilbert kernel regression estimate
- Just interpolate: kernel "ridgeless" regression can generalize
- Deep double descent: where bigger models and more data hurt
- Reconciling modern machine-learning practice and the classical bias–variance trade-off
- Benign overfitting in linear regression
- Wide neural networks of any depth evolve as linear models under gradient descent
- Surprises in high-dimensional ridgeless least squares interpolation
- Two Models of Double Descent for Weak Features
- The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve
- When do neural networks outperform kernel methods?
- A jamming transition from under- to over-parametrization affects generalization in deep learning
- Overparameterized neural networks implement associative memory
Cited In (15)
- Overparameterized maximum likelihood tests for detection of sparse vectors
- Benign overfitting and adaptive nonparametric regression
- A data-dependent approach for high-dimensional (robust) Wasserstein alignment
- Stopping rules for gradient methods for non-convex problems with additive noise in gradient
- Training adaptive reconstruction networks for blind inverse problems
- Convergence analysis for over-parameterized deep learning
- New equivalences between interpolation and SVMs: kernels and structured features
- Tractability from overparametrization: the example of the negative perceptron
- Double data piling: a high-dimensional solution for asymptotically perfect multi-category classification
- A moment-matching approach to testable learning and a new characterization of Rademacher complexity
- The Modern Mathematics of Deep Learning
- An adaptively weighted stochastic gradient MCMC algorithm for Monte Carlo simulation and global optimization
- Recent Theoretical Advances in Non-Convex Optimization
- Differentiability in unrolled training of neural physics simulators on transient dynamics
- The energy landscape of the Kuramoto model in random geometric graphs in a circle