Batched Stochastic Gradient Descent with Weighted Sampling
Publication: 4609808
DOI: 10.1007/978-3-319-59912-0_14
zbMATH Open: 1385.65041
arXiv: 1608.07641
OpenAlex: W2963244042
MaRDI QID: Q4609808
Authors: D. Needell, Rachel Ward
Publication date: 26 March 2018
Published in: Springer Proceedings in Mathematics & Statistics
Abstract: We analyze a batched variant of Stochastic Gradient Descent (SGD) with a weighted sampling distribution for smooth and non-smooth objective functions. We show that by distributing the batches computationally, a significant speedup in the convergence rate is provably possible compared to either batched sampling or weighted sampling alone. We propose several computationally efficient schemes to approximate the optimal weights, and we compute the proposed sampling distributions explicitly for the least squares and hinge loss problems. We show both analytically and experimentally that substantial gains can be obtained.
Full work available at URL: https://arxiv.org/abs/1608.07641
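As a concrete illustration of the abstract's setting, below is a minimal sketch of batched SGD with weighted sampling for a least squares objective. This is not the authors' algorithm; the sampling probabilities proportional to squared row norms are a common importance-weighting choice for least squares, and the function name and parameters are illustrative assumptions.

```python
import numpy as np

def batched_weighted_sgd(A, b, batch_size=4, steps=2000, lr=0.05, seed=0):
    """Sketch of batched SGD with weighted sampling for
    min_x (1/(2n)) * ||Ax - b||^2.

    Rows are sampled with probability proportional to their squared
    norm (an illustrative weight choice, not the paper's optimal
    weights); each gradient term is rescaled by 1/(n * p_i) so the
    batch gradient estimate remains unbiased.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    row_norms = np.einsum("ij,ij->i", A, A)   # ||a_i||^2 for each row
    p = row_norms / row_norms.sum()           # weighted sampling distribution
    x = np.zeros(d)
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, p=p)  # weighted batch (with replacement)
        residual = A[idx] @ x - b[idx]
        w = 1.0 / (n * p[idx])                     # importance-sampling correction
        grad = (A[idx] * (w * residual)[:, None]).sum(axis=0) / batch_size
        x -= lr * grad
    return x
```

For a consistent system (b = Ax* exactly), the stochastic gradient vanishes at the solution, so a fixed step size converges; for noisy systems a decaying step size would be needed.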
Recommendations
- Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
- Adaptive sampling for incremental optimization using stochastic gradient descent
- A stochastic multiple gradient descent algorithm
- Non-asymptotic guarantees for sampling by stochastic gradient descent
- Weighted SGD for \(\ell_p\) regression with randomized preconditioning
- A stochastic version of Stein variational gradient descent for efficient sampling
- scientific article; zbMATH DE number 6860839
- Combining resampling and reweighting for faithful stochastic optimization
- Distributed stochastic variance reduced gradient methods by sampling extra data with replacement
Mathematics Subject Classification
- Numerical mathematical programming methods (65K05)
- Quadratic programming (90C20)
- Stochastic programming (90C15)
Cites Work
- Regularization tools version 4.0 for MATLAB 7.3
- Pegasos: primal estimated sub-gradient solver for SVM
- A randomized Kaczmarz algorithm with exponential convergence
- Introductory lectures on convex optimization. A basic course.
- A Stochastic Approximation Method
- Decoding by Linear Programming
- Robust Stochastic Approximation Approach to Stochastic Programming
- Sample size selection in optimization methods for machine learning
- Efficiency of coordinate descent methods on huge-scale optimization problems
- On optimal probabilities in stochastic coordinate descent methods
- Weighted SGD for \(\ell_p\) regression with randomized preconditioning
- A proximal stochastic gradient method with progressive variance reduction
- Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
- Randomized Kaczmarz solver for noisy linear systems
- Large-scale machine learning with stochastic gradient descent
- Optimal distributed online prediction using mini-batches
- Two-subspace projection method for coherent overdetermined systems
- Minimizing finite sums with the stochastic average gradient
- Title not available
- Randomized quasi-Newton updates are linearly convergent matrix inversion algorithms
- Title not available
Cited In (11)
- Randomized Kaczmarz with averaging
- Iterative singular tube hard thresholding algorithms for tensor recovery
- Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
- Title not available
- Stochastic greedy algorithms for multiple measurement vectors
- Randomized Kaczmarz algorithm with averaging and block projection
- A selective review on statistical methods for massive data computation: distributed computing, subsampling, and minibatch techniques
- Optimized convergence of stochastic gradient descent by weighted averaging
- A multivariate adaptive gradient algorithm with reduced tuning efforts
- Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification
- A block-randomized stochastic method with importance sampling for CP tensor decomposition
This page was built for publication: Batched Stochastic Gradient Descent with Weighted Sampling