Perturbed iterate analysis for asynchronous stochastic optimization
Publication:4588862
Abstract: We introduce and analyze stochastic optimization methods where the input to each gradient update is perturbed by bounded noise. We show that this framework forms the basis of a unified approach to analyze asynchronous implementations of stochastic optimization algorithms. In this framework, asynchronous stochastic optimization algorithms can be thought of as serial methods operating on noisy inputs. Using our perturbed iterate framework, we provide new analyses of the Hogwild! algorithm and asynchronous stochastic coordinate descent that are simpler than earlier analyses, remove many assumptions of previous models, and in some cases yield improved upper bounds on the convergence rates. We proceed to apply our framework to develop and analyze KroMagnon: a novel, parallel, sparse stochastic variance-reduced gradient (SVRG) algorithm. We demonstrate experimentally on a 16-core machine that the sparse and parallel version of SVRG is in some cases more than four orders of magnitude faster than the standard SVRG algorithm.
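As an informal illustration of the perturbed iterate viewpoint described in the abstract, the sketch below simulates SGD on a toy least-squares problem where each stochastic gradient is evaluated at a noisy copy of the current iterate, modeling (for example) stale reads in an asynchronous implementation. The problem data, step size, and perturbation bound are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

# Hypothetical least-squares instance: f(x) = 1/(2n) * ||A x - b||^2,
# with stochastic gradients taken one row at a time.
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
b = A @ x_star

def stochastic_grad(x, i):
    """Gradient of the i-th component f_i(x) = 1/2 (a_i^T x - b_i)^2."""
    return (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
gamma = 0.01   # step size (assumed for this toy example)
delta = 1e-3   # bound on the iterate perturbation (assumed)

for t in range(5000):
    i = rng.integers(n)
    # Perturbed iterate view: the gradient is computed at a noisy copy
    # x_hat of the current iterate, with ||x_hat - x|| <= delta.
    noise = rng.uniform(-1.0, 1.0, size=d)
    x_hat = x + delta * noise / np.linalg.norm(noise)
    x = x - gamma * stochastic_grad(x_hat, i)

print("final error:", np.linalg.norm(x - x_star))
```

In this toy simulation the perturbation is injected explicitly; in the asynchronous algorithms analyzed in the paper it arises implicitly from concurrent, uncoordinated reads and writes to shared memory.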
Recommendations
- Improved asynchronous parallel optimization analysis for stochastic incremental methods
- Asynchronous stochastic coordinate descent: parallelism and convergence properties
- Distributed asynchronous deterministic and stochastic gradient optimization algorithms
- An Asynchronous Parallel Stochastic Coordinate Descent Algorithm
- Fully asynchronous stochastic coordinate descent: a tight lower bound on the parallelism achieving linear speedup
Cites work
- scientific article; zbMATH DE number 51132
- ARock: an algorithmic framework for asynchronous parallel coordinate updates
- An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization
- An Asynchronous Parallel Stochastic Coordinate Descent Algorithm
- Asynchronous stochastic coordinate descent: parallelism and convergence properties
- Beyond the regret minimization barrier: optimal algorithms for stochastic strongly-convex optimization
- Chaotic relaxation
- Convex optimization: algorithms and complexity
- Distributed asynchronous deterministic and stochastic gradient optimization algorithms
- Parallel coordinate descent methods for big data optimization
- Revisiting Asynchronous Linear Solvers
Cited in (15)
- On the convergence analysis of aggregated heavy-ball method
- scientific article; zbMATH DE number 7307490
- Parallel stochastic asynchronous coordinate descent: tight bounds on the possible parallelism
- Fully asynchronous stochastic coordinate descent: a tight lower bound on the parallelism achieving linear speedup
- Parallel and distributed asynchronous adaptive stochastic gradient methods
- Asynchronous parallel algorithms for nonconvex optimization
- A robust multi-batch L-BFGS method for machine learning
- Nonlinear Gradient Mappings and Stochastic Optimization: A General Framework with Applications to Heavy-Tail Noise
- scientific article; zbMATH DE number 7307474
- Applications of Fokker Planck equations in machine learning algorithms
- Incremental without replacement sampling in nonconvex optimization
- Improved asynchronous parallel optimization analysis for stochastic incremental methods
- Distributed stochastic optimization with large delays
- On the convergence analysis of asynchronous SGD for solving consistent linear systems
- Delay and cooperation in nonstochastic bandits