Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
From MaRDI portal
Publication:5962728
Abstract: We obtain an improved finite-sample guarantee on the linear convergence of stochastic gradient descent for smooth and strongly convex objectives, improving from a quadratic dependence on the conditioning \((L/\mu)^2\) (where \(L\) is a bound on the smoothness and \(\mu\) on the strong convexity) to a linear dependence on \(L/\mu\). Furthermore, we show how reweighting the sampling distribution (i.e. importance sampling) is necessary in order to further improve convergence, and obtain a linear dependence on the average smoothness \(\overline{L}/\mu\), dominating previous results. We also discuss importance sampling for SGD more broadly and show how it can also improve convergence in other scenarios. Our results are based on a connection we make between SGD and the randomized Kaczmarz algorithm, which allows us to transfer ideas between the separate bodies of literature studying each of the two methods. In particular, we recast the randomized Kaczmarz algorithm as an instance of SGD, and apply our results to prove its exponential convergence, but to the solution of a weighted least squares problem rather than the original least squares problem. We then present a modified Kaczmarz algorithm with partially biased sampling which does converge to the original least squares solution, with the same exponential convergence rate.
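The abstract's central connection can be illustrated by a minimal sketch of the randomized Kaczmarz method with rows drawn proportionally to their squared norms (the importance-sampling distribution discussed above); each step is an SGD step on one term of a weighted least squares objective. Function and variable names here are illustrative, not from the paper.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Solve a consistent system A x = b by randomized Kaczmarz,
    sampling row i with probability ||a_i||^2 / ||A||_F^2."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms = np.sum(A * A, axis=1)          # squared row norms
    probs = row_norms / row_norms.sum()        # importance-sampling weights
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        a_i = A[i]
        # project the current iterate onto the hyperplane a_i @ x = b[i]
        x += (b[i] - a_i @ x) / row_norms[i] * a_i
    return x

# Consistent overdetermined system: iterates converge exponentially
# to the unique solution, as the abstract states.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
x_true = rng.standard_normal(10)
b = A @ x_true
x_hat = randomized_kaczmarz(A, b)
print(np.linalg.norm(x_hat - x_true))
```

For an inconsistent (noisy) system the same iteration converges only to within a radius of the weighted least squares solution, which motivates the partially biased sampling variant described in the abstract.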
Recommendations
- Batched Stochastic Gradient Descent with Weighted Sampling
- Non-asymptotic guarantees for sampling by stochastic gradient descent
- Stochastic semidefinite optimization using sampling methods
- Adaptive sampling for incremental optimization using stochastic gradient descent
- Sampling Gaussian distributions in Krylov spaces with conjugate gradients
- A stochastic multiple gradient descent algorithm
- Boosted sampling
- Subgradient and sampling algorithms for \(\ell_1\) regression
- The approximation of generalized stochastic gradients of random regular functions
Cites work
- scientific article; zbMATH DE number 4001918
- scientific article; zbMATH DE number 1569104
- A Kaczmarz-Kovarik algorithm for symmetric ill-conditioned matrices
- A Stochastic Approximation Method
- A fast Kaczmarz-Kovarik algorithm for consistent least-squares problems
- A randomized Kaczmarz algorithm with exponential convergence
- A statistical perspective on algorithmic leveraging
- Acceleration of randomized Kaczmarz method via the Johnson-Lindenstrauss lemma
- An improved approximation algorithm for the column subset selection problem
- Augmented \(\ell_1\) and nuclear-norm models with a globally linearly convergent algorithm
- Block-iterative methods for consistent and inconsistent linear equations
- Block-projections algorithms with blocks containing mutually orthogonal rows and columns
- Efficiency of coordinate descent methods on huge-scale optimization problems
- Extensions of block-projections methods with relaxation parameters to inconsistent and rank-deficient least-squares problems
- Fundamentals of Computerized Tomography
- Graph sparsification by effective resistances
- Improving CUR matrix decomposition and the Nyström approximation via adaptive sampling
- Introductory lectures on convex optimization. A basic course.
- Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function
- Iterative algorithms for large partitioned linear systems, with applications to image reconstruction
- Large-scale machine learning with stochastic gradient descent
- On Kaczmarz's projection iteration as a direct solver for linear least squares problems
- On the acceleration of Kaczmarz's method for inconsistent linear systems
- Paved with good intentions: analysis of a randomized block Kaczmarz method
- Projection method for solving a singular system of linear equations and its applications
- Randomized Algorithms for Matrices and Data
- Randomized Kaczmarz solver for noisy linear systems
- Randomized extended Kaczmarz for solving least squares
- Revisiting the Nyström method for improved large-scale machine learning
- Robust Stochastic Approximation Approach to Stochastic Programming
- Sparse Legendre expansions via \(\ell_1\)-minimization
- Stable and Robust Sampling Strategies for Compressive Imaging
- Strong underrelaxation in Kaczmarz's method for inconsistent systems
- Two Algorithms Related to the Method of Steepest Descent
- Two-subspace projection method for coherent overdetermined systems
Cited in (93)
- Experience selection in deep reinforcement learning for control
- A new randomized Kaczmarz based kernel canonical correlation analysis algorithm with applications to information retrieval
- Multilevel stochastic gradient methods for nested composition optimization
- Weighted SGD for \(\ell_p\) regression with randomized preconditioning
- Data-driven algorithm selection and tuning in optimization and signal processing
- An accelerated variance reducing stochastic method with Douglas-Rachford splitting
- A sampling Kaczmarz-Motzkin algorithm for linear feasibility
- Convergence properties of the randomized extended Gauss-Seidel and Kaczmarz methods
- Auxiliary Gradient-Based Sampling Algorithms
- Stochastic gradients for large-scale tensor decomposition
- Randomized Kaczmarz for tensor linear systems
- Stochastic accelerated alternating direction method of multipliers with importance sampling
- Greed Works: An Improved Analysis of Sampling Kaczmarz--Motzkin
- Bridging the gap between constant step size stochastic gradient descent and Markov chains
- Subgradient and sampling algorithms for \(\ell_1\) regression
- A weighted randomized Kaczmarz method for solving linear systems
- Stochastic primal-dual coordinate method for regularized empirical risk minimization
- On data preconditioning for regularized loss minimization
- Nonlinear Kaczmarz algorithms and their convergence
- A Kaczmarz algorithm for sequences of projections, infinite products, and applications to frames in IFS \(L^2\) spaces
- A randomized algorithm for multivariate function approximation
- Stochastic gradient algorithm with random truncations
- An attention algorithm for solving large scale structured \(l_0\)-norm penalty estimation problems
- Stochastic modified equations and dynamics of stochastic gradient algorithms. I: Mathematical foundations
- Analysis of biased stochastic gradient descent using sequential semidefinite programs
- The Kaczmarz algorithm, row action methods, and statistical learning algorithms
- Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
- Kalman-based stochastic gradient method with stop condition and insensitivity to conditioning
- Accelerating the distributed Kaczmarz algorithm by strong over-relaxation
- Sampling Gaussian distributions in Krylov spaces with conjugate gradients
- Randomized Kaczmarz Converges Along Small Singular Vectors
- The dual Kaczmarz algorithm
- Surrounding the solution of a linear system of equations from all sides
- Rows versus Columns: Randomized Kaczmarz or Gauss--Seidel for Ridge Regression
- Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods
- Randomized Kaczmarz method with adaptive stepsizes for inconsistent linear systems
- Enhancement of the Kaczmarz algorithm with projection adjustment
- Improved asynchronous parallel optimization analysis for stochastic incremental methods
- On Adaptive Sketch-and-Project for Solving Linear Systems
- Linear convergence of the randomized sparse Kaczmarz method
- Randomized Kaczmarz with averaging
- Batched Stochastic Gradient Descent with Weighted Sampling
- On the regularizing property of stochastic gradient descent
- A Kaczmarz Algorithm for Solving Tree Based Distributed Systems of Equations
- On block Gaussian sketching for the Kaczmarz method
- Restricted strong convexity and its applications to convergence analysis of gradient-type methods in convex optimization
- Non-asymptotic guarantees for sampling by stochastic gradient descent
- Stochastic Markov gradient descent and training low-bit neural networks
- scientific article; zbMATH DE number 6982318
- Randomized extended average block Kaczmarz for solving least squares
- Worst-case recovery guarantees for least squares approximation using random samples
- Randomized numerical linear algebra: Foundations and algorithms
- Accelerating mini-batch SARAH by step size rules
- Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification
- Minimizing finite sums with the stochastic average gradient
- Block Kaczmarz method with inequalities
- On the linear convergence of the stochastic gradient method with constant step-size
- Regularized Kaczmarz Algorithms for Tensor Recovery
- On variance reduction for stochastic smooth convex optimization with multiplicative noise
- Iterative Methods for Solving Factorized Linear Systems
- Approximate Solutions of Linear Systems at a Universal Rate
- Adaptive sampling for incremental optimization using stochastic gradient descent
- Stability of the Kaczmarz reconstruction for stationary sequences
- Unified analysis of stochastic gradient methods for composite convex and smooth optimization
- On the Kaczmarz methods based on relaxed greedy selection for solving matrix equation \(A X B = C\)
- Sampling Kaczmarz-Motzkin method for linear feasibility problems: generalization and acceleration
- On fast greedy block Kaczmarz methods for solving large consistent linear systems
- Randomized Kaczmarz with geometrically smoothed momentum
- Quantile-based iterative methods for corrupted systems of linear equations
- Fractional-based stochastic gradient algorithms for time-delayed ARX models
- Randomized Kaczmarz algorithm with averaging and block projection
- On adaptive stochastic heavy ball momentum for solving linear systems
- The greedy randomized extended Kaczmarz algorithm for noisy linear systems
- On stochastic Kaczmarz type methods for solving large scale systems of ill-posed equations
- Targeted deep learning: framework, methods, and applications
- A multivariate adaptive gradient algorithm with reduced tuning efforts
- Sampled limited memory methods for massive linear inverse problems
- A subspace constrained randomized Kaczmarz method for structure or external knowledge exploitation
- Improving sampling accuracy of stochastic gradient MCMC methods via non-uniform subsampling of gradients
- Importance sampling in signal processing applications
- Stochastic Steffensen method
- A fast non-monotone line search for stochastic gradient descent
- The accelerated tensor Kaczmarz algorithm with adaptive parameters for solving tensor systems
- Reconstructing the thermal phonon transmission coefficient at solid interfaces in the phonon transport equation
- Two stochastic optimization algorithms for convex optimization with fixed point constraints
- An Optimal Scheduled Learning Rate for a Randomized Kaczmarz Algorithm
- Max-affine regression via first-order methods
- Quasi-Newton methods for machine learning: forget the past, just sample
- A block-randomized stochastic method with importance sampling for CP tensor decomposition
- A stochastic variance reduced gradient method with adaptive step for stochastic optimization
- A semi-randomized Kaczmarz method with simple random sampling for large-scale linear systems
- scientific article; zbMATH DE number 7307490