Just interpolate: kernel "ridgeless" regression can generalize
DOI: 10.1214/19-AOS1849 · zbMATH Open: 1453.68155 · arXiv: 1808.00387 · OpenAlex: W3104969455 · MaRDI QID: Q2196223 · FDO: Q2196223
Authors: Tengyuan Liang, Alexander Rakhlin
Publication date: 28 August 2020
Published in: The Annals of Statistics
Full work available at URL: https://arxiv.org/abs/1808.00387
Recommendations
- Generalization error of minimum weighted norm and kernel interpolation
- Surprises in high-dimensional ridgeless least squares interpolation
- Benign overfitting in linear regression
- Benefit of Interpolation in Nearest Neighbor Algorithms
- Overparameterization and generalization error: weighted trigonometric interpolation
Keywords: kernel methods; reproducing kernel Hilbert spaces; high dimensionality; implicit regularization; data-dependent bounds; minimum-norm interpolation; spectral decay
MSC: Nonparametric regression and quantile regression (62G08); Learning and adaptive systems in artificial intelligence (68T05); Hilbert spaces with reproducing kernels (= (proper) functional Hilbert spaces, including de Branges-Rovnyak and other structured spaces) (46E22)
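The record itself does not describe the method, but the title and keywords refer to a concrete estimator: minimum-RKHS-norm kernel interpolation, the ridgeless (regularization-free) limit of kernel ridge regression. Below is a minimal illustrative sketch of that estimator; the Gaussian kernel, bandwidth, and synthetic data are assumptions chosen for the example, not details taken from the paper.

```python
# Minimal sketch (not part of the zbMATH record): kernel "ridgeless" regression,
# i.e. the minimum-RKHS-norm interpolant f(x) = k(x, X) K^+ y, obtained as the
# lambda -> 0 limit of kernel ridge regression. Kernel and data are illustrative.
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between rows of A and rows of B.
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def fit_ridgeless(X, y, bandwidth=1.0):
    # Interpolating coefficients alpha = K^+ y; the pseudo-inverse returns the
    # minimum-norm solution when K is singular or ill-conditioned.
    K = gaussian_kernel(X, X, bandwidth)
    return np.linalg.pinv(K) @ y

def predict(X_train, alpha, X_test, bandwidth=1.0):
    return gaussian_kernel(X_test, X_train, bandwidth) @ alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 5))                       # n = 50 points, d = 5
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)    # noisy target
    alpha = fit_ridgeless(X, y)
    # The fitted function interpolates the training data (up to numerical error).
    print(np.max(np.abs(predict(X, alpha, X) - y)))
```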
Cites Work
- Title not available
- Regularization networks and support vector machines
- Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter
- An introduction to support vector machines and other kernel-based learning methods.
- Kernels for vector-valued functions: a review
- Title not available
- On early stopping in gradient descent learning
- Title not available
- A distribution-free theory of nonparametric regression
- Title not available
- On the limit of the largest eigenvalue of the large dimensional sample covariance matrix
- Optimal rates for the regularized least-squares algorithm
- DOI 10.1162/153244303321897690
- The origins of kriging
- Best choices for regularization parameters in learning theory: on the bias-variance problem.
- Model selection for regularized least-squares algorithm in learning theory
- The spectrum of kernel random matrices
- Kernel Ridge Regression
- Learning Theory
Cited In (45)
- Overparameterization and Generalization Error: Weighted Trigonometric Interpolation
- On the Inconsistency of Kernel Ridgeless Regression in Fixed Dimensions
- The interpolation phase transition in neural networks: memorization and generalization under lazy training
- Generalization error of random feature and kernel methods: hypercontractivity and kernel matrix concentration
- Communication-efficient distributed estimator for generalized linear models with a diverging number of covariates
- Diversity Sampling is an Implicit Regularization for Kernel Methods
- A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent*
- A Unifying Tutorial on Approximate Message Passing
- Training Neural Networks as Learning Data-adaptive Kernels: Provable Representation and Approximation Benefits
- Canonical thresholding for nonsparse high-dimensional linear regression
- A Multi-resolution Theory for Approximating Infinite-p-Zero-n: Transitional Inference, Individualized Predictions, and a World Without Bias-Variance Tradeoff
- Benign overfitting and adaptive nonparametric regression
- Multilevel Fine-Tuning: Closing Generalization Gaps in Approximation of Solution Maps under a Limited Budget for Training Data
- A sieve stochastic gradient descent estimator for online nonparametric regression in Sobolev ellipsoids
- Title not available
- Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks
- Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
- Theoretical issues in deep networks
- For interpolating kernel machines, minimizing the norm of the ERM solution maximizes stability
- HARFE: hard-ridge random feature expansion
- Generalization Error of Minimum Weighted Norm and Kernel Interpolation
- Deep Neural Networks, Generic Universal Interpolation, and Controlled ODEs
- Mehler’s Formula, Branching Process, and Compositional Kernels of Deep Neural Networks
- Convergence analysis for over-parameterized deep learning
- Title not available
- Benign Overfitting and Noisy Features
- New equivalences between interpolation and SVMs: kernels and structured features
- Kernel approximation: from regression to interpolation
- Title not available
- Tractability from overparametrization: the example of the negative perceptron
- Benign overfitting in linear regression
- Improved complexities for stochastic conditional gradient methods under interpolation-like conditions
- A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors
- On the proliferation of support vectors in high dimensions*
- SVRG meets AdaGrad: painless variance reduction
- Learning from non-random data in Hilbert spaces: an optimal recovery perspective
- Deep networks for system identification: a survey
- Deep learning: a statistical viewpoint
- Title not available
- Title not available
- Title not available
- Surprises in high-dimensional ridgeless least squares interpolation
- On the robustness of minimum norm interpolators and regularized empirical risk minimizers
- Linearized two-layers neural networks in high dimension
- A precise high-dimensional asymptotic theory for boosting and minimum-\(\ell_1\)-norm interpolated classifiers