The error-feedback framework: SGD with delayed gradients
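This entry concerns stochastic gradient descent in which the gradient applied at each step may have been computed at a stale iterate, as arises in asynchronous and distributed training. As a minimal sketch of the delayed-gradient update (notation ours, not taken from the paper), with step size $\eta$, delay $\tau_t \ge 0$, and an unbiased stochastic gradient $g$ of the objective $f$:

$$x_{t+1} = x_t - \eta\, g(x_{t-\tau_t}), \qquad \mathbb{E}[g(x)] = \nabla f(x).$$

The error-feedback viewpoint of the title treats the stale gradient as a perturbation of the current one when bounding the resulting convergence rate.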
Recommendations
- Distributed stochastic optimization with large delays
- A sharp convergence rate for a model equation of the asynchronous stochastic gradient descent
- A distributed flexible delay-tolerant proximal gradient algorithm
- Distributed stochastic inertial-accelerated methods with delayed derivatives for nonconvex problems
- Stochastic gradient descent with Polyak's learning rate
Cites work
- scientific article; zbMATH DE number 3790208 (no title available)
- scientific article; zbMATH DE number 51132 (no title available)
- A Stochastic Approximation Method
- An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization
- Communication-efficient algorithms for statistical optimization
- Cubic regularization of Newton method and its global performance
- Gradient descent learns linear dynamical systems
- Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression
- Improved asynchronous parallel optimization analysis for stochastic incremental methods
- Introductory lectures on convex optimization. A basic course.
- Large-scale machine learning with stochastic gradient descent
- Linear convergence of first order methods for non-strongly convex optimization
- New method of stochastic approximation type
- Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: A Generic Algorithmic Framework
- Optimal distributed online prediction using mini-batches
- Optimization methods for large-scale machine learning
- Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification
- Perturbed iterate analysis for asynchronous stochastic optimization
- Robust Stochastic Approximation Approach to Stochastic Programming
- Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming
- Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
Cited in (4)
- The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication
- Faster Rates for Compressed Federated Learning with Client-Variance Reduction
- Efficient and reliable overlay networks for decentralized federated learning
- Non-ergodic linear convergence property of the delayed gradient descent under the strongly convexity and the Polyak-Łojasiewicz condition