Just interpolate: kernel "ridgeless" regression can generalize
DOI: 10.1214/19-AOS1849 · zbMATH Open: 1453.68155 · arXiv: 1808.00387 · OpenAlex: W3104969455 · MaRDI QID: Q2196223 · FDO: Q2196223
Authors: Tengyuan Liang, Alexander Rakhlin
Publication date: 28 August 2020
Published in: The Annals of Statistics
Full work available at URL: https://arxiv.org/abs/1808.00387
Recommendations
- Generalization error of minimum weighted norm and kernel interpolation
- Surprises in high-dimensional ridgeless least squares interpolation
- Benign overfitting in linear regression
- Benefit of Interpolation in Nearest Neighbor Algorithms
- Overparameterization and generalization error: weighted trigonometric interpolation
Keywords: kernel methods; reproducing kernel Hilbert spaces; high dimensionality; implicit regularization; data-dependent bounds; minimum-norm interpolation; spectral decay
Mathematics Subject Classification: Nonparametric regression and quantile regression (62G08); Learning and adaptive systems in artificial intelligence (68T05); Hilbert spaces with reproducing kernels (= (proper) functional Hilbert spaces, including de Branges-Rovnyak and other structured spaces) (46E22)
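For context, kernel "ridgeless" regression refers to the minimum-norm interpolant that kernel ridge regression yields as the ridge penalty tends to zero. The sketch below is only an illustration of that estimator, not the authors' code; the RBF kernel choice, bandwidth, and function names are assumptions made for the example.

```python
# Illustrative sketch of kernel "ridgeless" (minimum-norm interpolating)
# regression, i.e. kernel ridge regression with the ridge penalty sent to zero.
# Kernel choice, bandwidth, and names are assumptions, not taken from the paper.
import numpy as np

def rbf_kernel(A, B, bandwidth=1.0):
    # Gaussian (RBF) kernel matrix: K[i, j] = exp(-||a_i - b_j||^2 / (2 h^2)).
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def fit_ridgeless(X_train, y_train, bandwidth=1.0):
    # Minimum-norm interpolant: alpha = K^+ y (the pseudo-inverse handles a
    # rank-deficient kernel matrix; with full rank this equals K^{-1} y).
    K = rbf_kernel(X_train, X_train, bandwidth)
    return np.linalg.pinv(K) @ y_train

def predict(X_test, X_train, alpha, bandwidth=1.0):
    # f(x) = sum_i alpha_i k(x, x_i)
    return rbf_kernel(X_test, X_train, bandwidth) @ alpha

# Tiny usage example on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
y = X[:, 0] + 0.1 * rng.normal(size=50)
alpha = fit_ridgeless(X, y)
print(predict(X[:5], X, alpha))  # reproduces the training labels up to numerical error
```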
Cites Work
- Scikit-learn: machine learning in Python
- Regularization networks and support vector machines
- Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter
- An introduction to support vector machines and other kernel-based learning methods
- Kernels for vector-valued functions: a review
- Title not available
- On early stopping in gradient descent learning
- Title not available
- A distribution-free theory of nonparametric regression
- Title not available
- On the limit of the largest eigenvalue of the large dimensional sample covariance matrix
- Optimal rates for the regularized least-squares algorithm
- DOI: 10.1162/153244303321897690
- The origins of kriging
- Best choices for regularization parameters in learning theory: on the bias-variance problem
- Model selection for regularized least-squares algorithm in learning theory
- The spectrum of kernel random matrices
- Kernel ridge regression
- Learning Theory
Cited In (47)
- On the Inconsistency of Kernel Ridgeless Regression in Fixed Dimensions
- The interpolation phase transition in neural networks: memorization and generalization under lazy training
- Generalization error of random feature and kernel methods: hypercontractivity and kernel matrix concentration
- Communication-efficient distributed estimator for generalized linear models with a diverging number of covariates
- A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent*
- A Unifying Tutorial on Approximate Message Passing
- Training Neural Networks as Learning Data-adaptive Kernels: Provable Representation and Approximation Benefits
- Canonical thresholding for nonsparse high-dimensional linear regression
- Benign overfitting and adaptive nonparametric regression
- Multilevel Fine-Tuning: Closing Generalization Gaps in Approximation of Solution Maps under a Limited Budget for Training Data
- Learning the mapping \(\mathbf{x}\mapsto \sum\limits_{i=1}^d x_i^2\): the cost of finding the needle in a haystack
- A sieve stochastic gradient descent estimator for online nonparametric regression in Sobolev ellipsoids
- Title not available
- Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks
- Binary classification of Gaussian mixtures: abundance of support vectors, benign overfitting, and regularization
- Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
- Theoretical issues in deep networks
- For interpolating kernel machines, minimizing the norm of the ERM solution maximizes stability
- HARFE: hard-ridge random feature expansion
- Mehler’s Formula, Branching Process, and Compositional Kernels of Deep Neural Networks
- Convergence analysis for over-parameterized deep learning
- Title not available
- Overparameterization and generalization error: weighted trigonometric interpolation
- A multi-resolution theory for approximating infinite-\(p\)-zero-\(n\): transitional inference, individualized predictions, and a world without bias-variance tradeoff
- Benign Overfitting and Noisy Features
- New equivalences between interpolation and SVMs: kernels and structured features
- Diversity sampling is an implicit regularization for kernel methods
- Kernel approximation: from regression to interpolation
- Title not available
- Tractability from overparametrization: the example of the negative perceptron
- Benign overfitting in linear regression
- Improved complexities for stochastic conditional gradient methods under interpolation-like conditions
- A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors
- On the proliferation of support vectors in high dimensions*
- SVRG meets AdaGrad: painless variance reduction
- Learning from non-random data in Hilbert spaces: an optimal recovery perspective
- Deep networks for system identification: a survey
- Deep learning: a statistical viewpoint
- Generalization error of minimum weighted norm and kernel interpolation
- Deep neural networks, generic universal interpolation, and controlled ODEs
- Title not available
- Title not available
- Title not available
- Surprises in high-dimensional ridgeless least squares interpolation
- On the robustness of minimum norm interpolators and regularized empirical risk minimizers
- Linearized two-layers neural networks in high dimension
- A precise high-dimensional asymptotic theory for boosting and minimum-\(\ell_1\)-norm interpolated classifiers