A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors
Publication: 6090836
DOI: 10.1137/22m1540302
arXiv: 2207.11621
OpenAlex: W4388538746
MaRDI QID: Q6090836
Nikhil Ghosh, Mikhail Belkin
Publication date: 20 November 2023
Published in: SIAM Journal on Mathematics of Data Science
Full work available at URL: https://arxiv.org/abs/2207.11621
Cites Work
- Unnamed Item
- The Hilbert kernel regression estimate
- High-dimensional asymptotics of prediction: ridge regression and classification
- Surprises in high-dimensional ridgeless least squares interpolation
- Generalization error of random feature and kernel methods: hypercontractivity and kernel matrix concentration
- Just interpolate: kernel "ridgeless" regression can generalize
- Support Vector Machines
- Two Models of Double Descent for Weak Features
- The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve
- Benign overfitting in linear regression
- Does learning require memorization? A short tale about a long tail
- Reconciling modern machine-learning practice and the classical bias–variance trade-off
- Introduction to nonparametric estimation
- Ridge regression and asymptotic minimax estimation over spheres of growing dimension
- When is memorization of irrelevant training data necessary for high-accuracy learning?