Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
From MaRDI portal
(Redirected from Publication:5962728)
Abstract: We obtain an improved finite-sample guarantee on the linear convergence of stochastic gradient descent for smooth and strongly convex objectives, improving from a quadratic dependence on the conditioning (where is a bound on the smoothness and on the strong convexity) to a linear dependence on . Furthermore, we show how reweighting the sampling distribution (i.e. importance sampling) is necessary in order to further improve convergence, and obtain a linear dependence in the average smoothness, dominating previous results. We also discuss importance sampling for SGD more broadly and show how it can improve convergence also in other scenarios. Our results are based on a connection we make between SGD and the randomized Kaczmarz algorithm, which allows us to transfer ideas between the separate bodies of literature studying each of the two methods. In particular, we recast the randomized Kaczmarz algorithm as an instance of SGD, and apply our results to prove its exponential convergence, but to the solution of a weighted least squares problem rather than the original least squares problem. We then present a modified Kaczmarz algorithm with partially biased sampling which does converge to the original least squares solution with the same exponential convergence rate.
Recommendations
- Batched Stochastic Gradient Descent with Weighted Sampling
- Non-asymptotic guarantees for sampling by stochastic gradient descent
- Stochastic semidefinite optimization using sampling methods
- Adaptive sampling for incremental optimization using stochastic gradient descent
- Sampling Gaussian distributions in Krylov spaces with conjugate gradients
- A stochastic multiple gradient descent algorithm
- Boosted sampling
- Subgradient and sampling algorithms for _1 regression
- The approximation of generalized stochastic gradients of random regular functions
Cites work
- scientific article; zbMATH DE number 4001918 (Why is no real title available?)
- scientific article; zbMATH DE number 1569104 (Why is no real title available?)
- A Kaczmarz-Kovarik algorithm for symmetric ill-conditioned matrices
- A Stochastic Approximation Method
- A fast Kaczmarz-Kovarik algorithm for consistent least-squares problems
- A randomized Kaczmarz algorithm with exponential convergence
- A statistical perspective on algorithmic leveraging
- Acceleration of randomized Kaczmarz method via the Johnson-Lindenstrauss lemma
- An improved approximation algorithm for the column subset selection problem
- Augmented _1 and nuclear-norm models with a globally linearly convergent algorithm
- Block-iterative methods for consistent and inconsistent linear equations
- Block-projections algorithms with blocks containing mutually orthogonal rows and columns
- Efficiency of coordinate descent methods on huge-scale optimization problems
- Extensions of block-projections methods with relaxation parameters to inconsistent and rank-deficient least-squares problems
- Fundamentals of Computerized Tomography
- Graph sparsification by effective resistances
- Improving CUR matrix decomposition and the Nyström approximation via adaptive sampling
- Introductory lectures on convex optimization. A basic course.
- Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function
- Iterative algorithms for large partitioned linear systems, with applications to image reconstruction
- Large-scale machine learning with stochastic gradient descent
- On Kaczmarz's projection iteration as a direct solver for linear least squares problems
- On the acceleration of Kaczmarz's method for inconsistent linear systems
- Paved with good intentions: analysis of a randomized block Kaczmarz method
- Projection method for solving a singular system of linear equations and its applications
- Randomized Algorithms for Matrices and Data
- Randomized Kaczmarz solver for noisy linear systems
- Randomized extended Kaczmarz for solving least squares
- Revisiting the Nyström method for improved large-scale machine learning
- Robust Stochastic Approximation Approach to Stochastic Programming
- Sparse Legendre expansions via _1-minimization
- Stable and Robust Sampling Strategies for Compressive Imaging
- Strong underrelaxation in Kaczmarz's method for inconsistent systems
- Two Algorithms Related to the Method of Steepest Descent
- Two-subspace projection method for coherent overdetermined systems
Cited in
(only showing first 100 items - show all)- Analysis of biased stochastic gradient descent using sequential semidefinite programs
- Randomized iterative methods for generalized absolute value equations: solvability and error bounds
- Targeted deep learning: framework, methods, and applications
- Reconstructing the thermal phonon transmission coefficient at solid interfaces in the phonon transport equation
- Randomized Kaczmarz for tensor linear systems
- Convergence properties of the randomized extended Gauss-Seidel and Kaczmarz methods
- Linear convergence of the randomized sparse Kaczmarz method
- Adaptive sampling for incremental optimization using stochastic gradient descent
- An attention algorithm for solving large scale structured \(l_0\)-norm penalty estimation problems
- Rows versus Columns: Randomized Kaczmarz or Gauss--Seidel for Ridge Regression
- A block-randomized stochastic method with importance sampling for CP tensor decomposition
- Sampling Gaussian distributions in Krylov spaces with conjugate gradients
- Non-asymptotic guarantees for sampling by stochastic gradient descent
- Fractional-based stochastic gradient algorithms for time-delayed ARX models
- On a nonlinear fast deterministic block Kaczmarz method for solving nonlinear equations
- Randomized Kaczmarz with averaging
- A new randomized Kaczmarz based kernel canonical correlation analysis algorithm with applications to information retrieval
- scientific article; zbMATH DE number 6982318 (Why is no real title available?)
- An accelerated variance reducing stochastic method with Douglas-Rachford splitting
- Randomized extended average block Kaczmarz for solving least squares
- Stochastic gradients for large-scale tensor decomposition
- Surrounding the solution of a linear system of equations from all sides
- A stochastic variance reduced gradient method with adaptive step for stochastic optimization
- A sampling Kaczmarz-Motzkin algorithm for linear feasibility
- On the regularizing property of stochastic gradient descent
- scientific article; zbMATH DE number 7307490 (Why is no real title available?)
- Enhancement of the Kaczmarz algorithm with projection adjustment
- On variance reduction for stochastic smooth convex optimization with multiplicative noise
- On adaptive stochastic heavy ball momentum for solving linear systems
- A fast non-monotone line search for stochastic gradient descent
- Adaptive client sampling in federated learning via online learning with bandit feedback
- The accelerated tensor Kaczmarz algorithm with adaptive parameters for solving tensor systems
- Accelerating the distributed Kaczmarz algorithm by strong over-relaxation
- Unified analysis of stochastic gradient methods for composite convex and smooth optimization
- A weighted randomized Kaczmarz method for solving linear systems
- Sampled limited memory methods for massive linear inverse problems
- On data preconditioning for regularized loss minimization
- Randomized Kaczmarz Converges Along Small Singular Vectors
- On Adaptive Sketch-and-Project for Solving Linear Systems
- Quasi-Newton methods for machine learning: forget the past, just sample
- Randomized numerical linear algebra: Foundations and algorithms
- Max-affine regression via first-order methods
- Quantile-based iterative methods for corrupted systems of linear equations
- Stability of the Kaczmarz reconstruction for stationary sequences
- Improving sampling accuracy of stochastic gradient MCMC methods via non-uniform subsampling of gradients
- Stochastic gradient algorithm with random truncations
- Weighted SGD for _p regression with randomized preconditioning
- On stochastic Kaczmarz type methods for solving large scale systems of ill-posed equations
- Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
- Stochastic accelerated alternating direction method of multipliers with importance sampling
- A subspace constrained randomized Kaczmarz method for structure or external knowledge exploitation
- Accelerating mini-batch SARAH by step size rules
- Randomized Kaczmarz method with adaptive stepsizes for inconsistent linear systems
- The dual Kaczmarz algorithm
- Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods
- Stochastic Steffensen method
- On the last iterate convergence of momentum methods
- Restricted strong convexity and its applications to convergence analysis of gradient-type methods in convex optimization
- An Optimal Scheduled Learning Rate for a Randomized Kaczmarz Algorithm
- Subgradient and sampling algorithms for _1 regression
- A semi-randomized Kaczmarz method with simple random sampling for large-scale linear systems
- Linear discriminant analysis with the randomized Kaczmarz method
- On the Kaczmarz methods based on relaxed greedy selection for solving matrix equation A X B = C
- Sampling Kaczmarz-Motzkin method for linear feasibility problems: generalization and acceleration
- The greedy randomized extended Kaczmarz algorithm for noisy linear systems
- Regularized Kaczmarz Algorithms for Tensor Recovery
- Greed Works: An Improved Analysis of Sampling Kaczmarz--Motzkin
- Randomized Kaczmarz algorithm with averaging and block projection
- Multilevel stochastic gradient methods for nested composition optimization
- Randomized iterative methods for tensor regression under the t-product
- Data-driven algorithm selection and tuning in optimization and signal processing
- Two stochastic optimization algorithms for convex optimization with fixed point constraints
- The Kaczmarz algorithm, row action methods, and statistical learning algorithms
- On the linear convergence of the stochastic gradient method with constant step-size
- A randomized algorithm for multivariate function approximation
- Importance sampling in signal processing applications
- Approximate Solutions of Linear Systems at a Universal Rate
- Weighted SGD for \(\ell_p\) regression with randomized preconditioning
- Quantile-RK and double quantile-RK error horizon analysis
- The quaternion relaxed greedy randomized Kaczmarz method with adaptive parameters for solving quaternion matrix equation
- On block Gaussian sketching for the Kaczmarz method
- Kernel-algorithms in frame-approximations
- Experience selection in deep reinforcement learning for control
- Deterministic and randomized Kaczmarz methods for AXB=C with applications to color image restoration
- On stochastic block methods for solving nonlinear equations
- Kalman-based stochastic gradient method with stop condition and insensitivity to conditioning
- Nonlinear Kaczmarz algorithms and their convergence
- Improved asynchronous parallel optimization analysis for stochastic incremental methods
- Stochastic modified equations and dynamics of stochastic gradient algorithms. I: Mathematical foundations
- A multivariate adaptive gradient algorithm with reduced tuning efforts
- Bridging the gap between constant step size stochastic gradient descent and Markov chains
- Homogenization of SGD in high-dimensions: exact dynamics and generalization properties
- Worst-case recovery guarantees for least squares approximation using random samples
- A Kaczmarz algorithm for sequences of projections, infinite products, and applications to frames in IFS \(L^2\) spaces
- Iterative Methods for Solving Factorized Linear Systems
- Randomly sparsified Richardson iteration: a dimension-independent sparse linear solver
- Randomized Kaczmarz methods for t-product tensor linear systems with factorized operators
- Stochastic Markov gradient descent and training low-bit neural networks
- Auxiliary Gradient-Based Sampling Algorithms
- Stochastic primal-dual coordinate method for regularized empirical risk minimization
This page was built for publication: Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q5962728)