Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
DOI: 10.1017/S0962492921000039
MaRDI QID: Q5887828
Publication date: 14 April 2023
Published in: Acta Numerica
Full work available at URL: https://arxiv.org/abs/2105.14368
Related Items (14)
- An adaptively weighted stochastic gradient MCMC algorithm for Monte Carlo simulation and global optimization
- Stopping rules for gradient methods for non-convex problems with additive noise in gradient
- Overparameterized maximum likelihood tests for detection of sparse vectors
- Tractability from overparametrization: the example of the negative perceptron
- Recent Theoretical Advances in Non-Convex Optimization
- Benign overfitting and adaptive nonparametric regression
- A moment-matching approach to testable learning and a new characterization of Rademacher complexity
- A data-dependent approach for high-dimensional (robust) Wasserstein alignment
- Training adaptive reconstruction networks for blind inverse problems
- Convergence analysis for over-parameterized deep learning
- New equivalences between interpolation and SVMs: kernels and structured features
- Double data piling: a high-dimensional solution for asymptotically perfect multi-category classification
- Differentiability in unrolled training of neural physics simulators on transient dynamics
- The energy landscape of the Kuramoto model in random geometric graphs in a circle
Uses Software
Cites Work
- Comment on: Boosting algorithms: regularization, prediction and model fitting
- A randomized Kaczmarz algorithm with exponential convergence
- Occam's razor
- Gauss and the invention of least squares
- The Hilbert kernel regression estimate.
- A decision-theoretic generalization of on-line learning and an application to boosting
- Boosting the margin: a new explanation for the effectiveness of voting methods
- A distribution-free theory of nonparametric regression
- Surprises in high-dimensional ridgeless least squares interpolation
- Just interpolate: kernel "ridgeless" regression can generalize
- On early stopping in gradient descent learning
- 10.1162/153244303321897690
- Deep double descent: where bigger models and more data hurt
- When do neural networks outperform kernel methods?
- Two Models of Double Descent for Weak Features
- The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve
- Overparameterized neural networks implement associative memory
- Benign overfitting in linear regression
- Reconciling modern machine-learning practice and the classical bias–variance trade-off
- Advanced Lectures on Machine Learning
- Learning Theory
- Nearest neighbor pattern classification
- A Correspondence Between Bayesian Estimation on Stochastic Processes and Smoothing by Splines
- Wide neural networks of any depth evolve as linear models under gradient descent
- A jamming transition from under- to over-parametrization affects generalization in deep learning
- Scattered Data Approximation
- The elements of statistical learning. Data mining, inference, and prediction
- Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm