Nonparametric stochastic approximation with large step-sizes
Abstract: We consider the random-design least-squares regression problem within the reproducing kernel Hilbert space (RKHS) framework. Given a stream of independent and identically distributed input/output data, we aim to learn a regression function within an RKHS, even if the optimal predictor (i.e., the conditional expectation) is not in the RKHS. In a stochastic approximation framework where the estimator is updated after each observation, we show that the averaged unregularized least-mean-square algorithm (a form of stochastic gradient descent), with a sufficiently large step-size, attains optimal rates of convergence for a variety of regimes of smoothness of the optimal prediction function and of the functions in the RKHS.
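The algorithm described in the abstract is a one-pass stochastic gradient recursion on the squared error, run without regularization, with a constant step-size, and with averaging of the iterates. Below is a minimal Python sketch of such a recursion, not the authors' implementation: the Gaussian kernel, its bandwidth, and the step-size value are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(x, z, bandwidth=0.5):
    # Gaussian RBF kernel; the bandwidth is an arbitrary illustrative choice.
    return np.exp(-(x - z) ** 2 / (2 * bandwidth ** 2))

def averaged_kernel_lms(xs, ys, step_size=1.0, kernel=gaussian_kernel):
    """One pass of unregularized kernel least-mean-squares with averaging.

    After t observations the iterate is g_t = sum_i coef[i] * kernel(xs[i], .),
    updated by g_t = g_{t-1} - step_size * (g_{t-1}(x_t) - y_t) * kernel(x_t, .);
    the returned predictor is the average of the iterates g_1, ..., g_n.
    """
    n = len(xs)
    coef = np.zeros(n)      # expansion coefficients of the current iterate
    avg_coef = np.zeros(n)  # coefficients of the running average of iterates
    for t in range(n):
        # Evaluate the current iterate at the new input x_t.
        residual = coef[:t] @ kernel(xs[:t], xs[t]) - ys[t]
        # Unregularized stochastic gradient step on the squared error:
        # adds the term -step_size * residual * kernel(x_t, .) to the iterate.
        coef[t] = -step_size * residual
        # Running average of the iterates.
        avg_coef += (coef - avg_coef) / (t + 1)
    return lambda x: avg_coef @ kernel(xs, x)

# Toy usage on a target function that need not lie in the chosen RKHS.
rng = np.random.default_rng(0)
xs = rng.uniform(-1.0, 1.0, 500)
ys = np.sin(np.pi * xs) + 0.1 * rng.standard_normal(500)
predictor = averaged_kernel_lms(xs, ys, step_size=1.0)
print(predictor(0.5))  # should be close to sin(pi / 2) = 1
```

Roughly, the averaging step is what permits the large constant step-size of the title: the individual iterates need not converge, while their average does, and the step-size choice interacts with the smoothness regimes discussed in the abstract.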
Recommendations
- On convergence of kernel learning estimators
- Optimal rates for spectral algorithms with least-squares regression over Hilbert spaces
- Learning rates of least-square regularized regression
- Optimal learning rates for least squares regularized regression with unbounded sampling
- Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression
Cites work
- scientific article; zbMATH DE number 45848 (no title available)
- scientific article; zbMATH DE number 524360 (no title available)
- scientific article; zbMATH DE number 977420 (no title available)
- scientific article; zbMATH DE number 1950576 (no title available)
- scientific article; zbMATH DE number 936298 (no title available)
- scientific article; zbMATH DE number 5055767 (no title available)
- scientific article; zbMATH DE number 3273551 (no title available)
- scientific article; zbMATH DE number 3372755 (no title available)
- scientific article; zbMATH DE number 961607 (no title available)
- A Stochastic Approximation Method
- A new concentration result for regularized risk minimizers
- An introduction to support vector machines and other kernel-based learning methods.
- Best choices for regularization parameters in learning theory: on the bias-variance problem.
- Beyond the regret minimization barrier: optimal algorithms for stochastic strongly-convex optimization
- Boosting with early stopping: convergence and consistency
- Early stopping and non-parametric regression: an optimal data-dependent stopping rule
- Fast kernel classifiers with online and active learning
- Introduction to nonparametric estimation
- Learning theory estimates via integral operators and their approximations
- Model selection for regularized least-squares algorithm in learning theory
- Nonparametric stochastic approximation with large step-sizes
- On early stopping in gradient descent learning
- On the mathematical foundations of learning
- Online Learning as Stochastic Approximation of Regularization Paths: Optimality and Almost-Sure Convergence
- Online Learning with Kernels
- Online gradient descent learning algorithms
- Online learning and online convex optimization
- Optimal rates for the regularized least-squares algorithm
- Random design analysis of ridge regression
- Randomized Algorithms for Matrices and Data
- Robust Stochastic Approximation Approach to Stochastic Programming
- Some results on Tchebycheffian spline functions and stochastic processes
- The Forgetron: A Kernel-Based Perceptron on a Budget
- Theory of Reproducing Kernels
- Universal kernels
- Universality, Characteristic Kernels and RKHS Embedding of Measures
Cited in (47)
- Optimal indirect estimation for linear inverse problems with discretely sampled functional data
- Differentially private SGD with non-smooth losses
- Graph-dependent implicit regularisation for distributed stochastic subgradient descent
- Nonparametric stochastic approximation with large step-sizes
- Complexity analysis of stochastic gradient methods for PDE-constrained optimal control problems with uncertain parameters
- Rates of convergence of randomized Kaczmarz algorithms in Hilbert spaces
- Unregularized online algorithms with varying Gaussians
- Convergence of unregularized online learning algorithms
- Bridging the gap between constant step size stochastic gradient descent and Markov chains
- Dimension independent excess risk by stochastic gradient descent
- Distribution-free robust linear regression
- Distributed SGD in overparametrized linear regression
- A sieve stochastic gradient descent estimator for online nonparametric regression in Sobolev ellipsoids
- Stochastic subspace correction methods and fault tolerance
- Consistent change-point detection with kernels
- scientific article; zbMATH DE number 7370542 (no title available)
- On the Convergence of Stochastic Gradient Descent for Nonlinear Ill-Posed Problems
- Optimality of robust online learning
- Efficient mini-batch stochastic gradient descent with centroidal Voronoi tessellation for PDE-constrained optimization under uncertainty
- Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression
- Regularization: From Inverse Problems to Large-Scale Machine Learning
- Stochastic subspace correction in Hilbert space
- An analysis of stochastic variance reduced gradient for linear inverse problems
- Sparse online regression algorithm with insensitive loss functions
- Convergence rates of gradient methods for convex optimization in the space of measures
- A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)
- Differentially private SGD with random features
- On the regularizing property of stochastic gradient descent
- Ensemble Kalman inversion: a derivative-free technique for machine learning tasks
- Online regularized learning algorithm for functional data
- From inexact optimization to learning via gradient concentration
- Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification
- Uncertainty quantification for stochastic approximation limits using chaos expansion
- An elementary analysis of ridge regression with random design
- An Online Projection Estimator for Nonparametric Regression in Reproducing Kernel Hilbert Spaces
- Biparametric identification for a free boundary of ductal carcinoma in situ
- A kernel multiple change-point algorithm via model selection
- Streaming kernel regression with provably adaptive mean, variance, and regularization
- Parsimonious online learning with kernels via sparse projections in function space
- High probability bounds for stochastic subgradient schemes with heavy tailed noise
- Approximate maximum likelihood estimation for population genetic inference
- Optimal rates for multi-pass stochastic gradient methods
- Fast and strong convergence of online learning algorithms
- Ivanov-regularised least-squares estimators over large RKHSs and their interpolation spaces
- scientific article; zbMATH DE number 7306853 (no title available)
- New efficient algorithms for multiple change-point detection with reproducing kernels
- Capacity dependent analysis for functional online learning algorithms