Reconciling modern machine-learning practice and the classical bias–variance trade-off

From MaRDI portal

Publication:5218544

DOI: 10.1073/pnas.1903070116
zbMath: 1433.68325
arXiv: 1812.11118
OpenAlex: W2963518130
Wikidata: Q92153099
Scholia: Q92153099
MaRDI QID: Q5218544

Authors: Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal

Publication date: 4 March 2020

Published in: Proceedings of the National Academy of Sciences

Full work available at URL: https://arxiv.org/abs/1812.11118

Related Items (93)

Mehler’s Formula, Branching Process, and Compositional Kernels of Deep Neural Networks
Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks
Deep learning: a statistical viewpoint
Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
Machine learning from a continuous viewpoint. I
Deep learning for inverse problems. Abstracts from the workshop held March 7--13, 2021 (hybrid meeting)
Surprises in high-dimensional ridgeless least squares interpolation
Counterfactual inference with latent variable and its application in mental health care
Generalization error of random feature and kernel methods: hypercontractivity and kernel matrix concentration
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Neural network training using \(\ell_1\)-regularization and bi-fidelity data
Learning curves of generic features maps for realistic datasets with a teacher-student model*
Deep networks on toroids: removing symmetries reveals the structure of flat regions in the landscape geometry*
A precise high-dimensional asymptotic theory for boosting and minimum-\(\ell_1\)-norm interpolated classifiers
Dimensionality Reduction, Regularization, and Generalization in Overparameterized Regressions
Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting, and Regularization
Prevalence of neural collapse during the terminal phase of deep learning training
Overparameterized neural networks implement associative memory
Benign overfitting in linear regression
The inverse variance–flatness relation in stochastic gradient descent is critical for finding flat minima
On Transversality of Bent Hyperplane Arrangements and the Topological Expressiveness of ReLU Neural Networks
Scientific machine learning through physics-informed neural networks: where we are and what's next
Overparameterization and Generalization Error: Weighted Trigonometric Interpolation
Benefit of Interpolation in Nearest Neighbor Algorithms
On the Benefit of Width for Neural Networks: Disappearance of Basins
Training Neural Networks as Learning Data-adaptive Kernels: Provable Representation and Approximation Benefits
HARFE: hard-ridge random feature expansion
SCORE: approximating curvature information under self-concordant regularization
Deep empirical risk minimization in finance: Looking into the future
High dimensional binary classification under label shift: phase transition and regularization
Large-dimensional random matrix theory and its applications in deep learning and wireless communications
On the Inconsistency of Kernel Ridgeless Regression in Fixed Dimensions
Free dynamics of feature learning processes
A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors
Reliable extrapolation of deep neural operators informed by physics or sparse observations
Learning algebraic models of quantum entanglement
Re-thinking high-dimensional mathematical statistics. Abstracts from the workshop held May 15--21, 2022
Unnamed Item
Is deep learning a useful tool for the pure mathematician?
Also for \(k\)-means: more data does not imply better performance
Random neural networks in the infinite width limit as Gaussian processes
Stability of the scattering transform for deformations with minimal regularity
High-Dimensional Analysis of Double Descent for Linear Regression with Random Projections
On the robustness of sparse counterfactual explanations to adverse perturbations
On the influence of over-parameterization in manifold based surrogates and deep neural operators
An instance-dependent simulation framework for learning with label noise
On lower bounds for the bias-variance trade-off
Benign Overfitting and Noisy Features
Learning ability of interpolating deep convolutional neural networks
The mathematics of artificial intelligence
Unnamed Item
On the properties of bias-variance decomposition for kNN regression
Discussion of: ``Nonparametric regression using deep neural networks with ReLU activation function''
Optimization for deep learning: an overview
Landscape and training regimes in deep learning
Over-parametrized deep neural networks minimizing the empirical risk do not generalize well
A statistician teaches deep learning
Unnamed Item
Unnamed Item
Unnamed Item
Unnamed Item
Shallow neural networks for fluid flow reconstruction with limited sensors
A generic physics-informed neural network-based constitutive model for soft biological tissues
A selective overview of deep learning
Linearized two-layers neural networks in high dimension
The Random Feature Model for Input-Output Maps between Banach Spaces
High-dimensional dynamics of generalization error in neural networks
Generalization Error of Minimum Weighted Norm and Kernel Interpolation
Unnamed Item
Unnamed Item
Dimension independent excess risk by stochastic gradient descent
Implicit Regularization and Momentum Algorithms in Nonlinearly Parameterized Adaptive Control and Prediction
Precise statistical analysis of classification accuracies for adversarial training
On the robustness of minimum norm interpolators and regularized empirical risk minimizers
Scaling description of generalization with number of parameters in deep learning
A Multi-resolution Theory for Approximating Infinite-p-Zero-n: Transitional Inference, Individualized Predictions, and a World Without Bias-Variance Tradeoff
Large scale analysis of generalization error in learning using margin based classification methods
Unnamed Item
Unnamed Item
Unnamed Item
Unnamed Item
AdaBoost and robust one-bit compressed sensing
Unnamed Item
A Unifying Tutorial on Approximate Message Passing
Bayesian learning via neural Schrödinger-Föllmer flows
Understanding neural networks with reproducing kernel Banach spaces
The interpolation phase transition in neural networks: memorization and generalization under lazy training
A sieve stochastic gradient descent estimator for online nonparametric regression in Sobolev ellipsoids
A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent*
Generalisation error in learning with random features and the hidden manifold model*
For interpolating kernel machines, minimizing the norm of the ERM solution maximizes stability
Two Models of Double Descent for Weak Features
Prediction errors for penalized regressions based on generalized approximate message passing

This page was built for publication: Reconciling modern machine-learning practice and the classical bias–variance trade-off