An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization
Publication: 2979326
DOI: 10.1109/TAC.2016.2525015
zbMATH Open: 1359.90080
arXiv: 1505.04824
OpenAlex: W2949585412
MaRDI QID: Q2979326
FDO: Q2979326
Authors: Hamid Reza Feyzmahdavian, Arda Aytekin, Mikael Johansson
Publication date: 3 May 2017
Published in: IEEE Transactions on Automatic Control
Abstract: Mini-batch optimization has proven to be a powerful paradigm for large-scale learning. However, state-of-the-art parallel mini-batch algorithms assume synchronous operation or cyclic update orders. When worker nodes are heterogeneous (due to different computational capabilities or different communication delays), synchronous and cyclic operations are inefficient, since they leave workers idle while waiting for slower nodes to complete their computations. In this paper, we propose an asynchronous mini-batch algorithm for regularized stochastic optimization problems with smooth loss functions that eliminates idle waiting and allows workers to run at their maximal update rates. We show that by suitably choosing the step-size values, the algorithm achieves a convergence rate of the order $O(1/\sqrt{T})$ for general convex regularization functions, and the rate $O(1/T)$ for strongly convex regularization functions, where $T$ is the number of iterations. In both cases, the impact of asynchrony on the convergence rate of our algorithm is asymptotically negligible, and a near-linear speedup in the number of workers can be expected. Theoretical results are confirmed in real implementations on a distributed computing infrastructure.
Full work available at URL: https://arxiv.org/abs/1505.04824
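The following is a minimal serial simulation of the idea described in the abstract: workers report mini-batch gradients computed at possibly stale (delayed) iterates, and the master applies a proximal step for the regularizer with a diminishing step size. It is a sketch, not the authors' implementation; the l1-regularized least-squares problem, the bounded-delay model, and all names are illustrative assumptions.

```python
# Sketch of an asynchronous mini-batch proximal gradient method,
# simulated serially. Problem, delay model, and parameters are assumptions
# for illustration, not the algorithm or code from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic problem: min_x (1/(2n))||Ax - b||^2 + lam*||x||_1
n, d = 1000, 20
A = rng.normal(size=(n, d))
x_true = np.zeros(d)
x_true[:5] = rng.normal(size=5)
b = A @ x_true + 0.1 * rng.normal(size=n)
lam = 0.05

def minibatch_grad(x, batch):
    """Stochastic gradient of the smooth loss on a sampled mini-batch."""
    Ab = A[batch]
    return Ab.T @ (Ab @ x - b[batch]) / len(batch)

def prox_l1(v, t):
    """Proximal operator of t*lam*||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)

batch_size, max_delay, T = 32, 5, 2000
iterates = [np.zeros(d)]            # keep history so stale reads can be simulated
x = iterates[-1]
for k in range(1, T + 1):
    # A worker reports a gradient computed at a possibly stale iterate
    # x_{k - tau}, with tau bounded by max_delay (bounded-delay assumption).
    tau = rng.integers(0, min(max_delay, len(iterates)))
    x_stale = iterates[-1 - tau]
    batch = rng.integers(0, n, size=batch_size)
    g = minibatch_grad(x_stale, batch)

    step = 1.0 / np.sqrt(k)         # O(1/sqrt(k)) step size for the convex case
    x = prox_l1(x - step * g, step)
    iterates.append(x)

obj = 0.5 * np.mean((A @ x - b) ** 2) + lam * np.sum(np.abs(x))
print(f"final objective: {obj:.4f}")
```

In this toy setup the delays only perturb which iterate each gradient is evaluated at; with a bounded delay and a diminishing step size, the iterates still approach a minimizer, which mirrors the abstract's claim that the effect of asynchrony on the rate is asymptotically negligible.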
Recommendations
- An Asynchronous Parallel Stochastic Coordinate Descent Algorithm
- Asynchronous stochastic coordinate descent: parallelism and convergence properties
- Parallel and distributed asynchronous adaptive stochastic gradient methods
- Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization
- Distributed asynchronous deterministic and stochastic gradient optimization algorithms
- Stochastic modified equations for the asynchronous stochastic gradient descent
- A mini-batch stochastic conjugate gradient algorithm with variance reduction
- Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification
- On the parallelization upper bound for asynchronous stochastic gradients descent in non-convex optimization
- Asynchronous Distributed Optimization Via Randomized Dual Proximal Gradient
Mathematics Subject Classification: Applications of mathematical programming (90C90); Stochastic programming (90C15); Parallel algorithms in computer science (68W10)
Cited In (21)
- Distributed Learning with Sparse Communications by Identification
- Title not available
- Privacy-preserving distributed projected one-point bandit online optimization over directed graphs
- Linear convergence of proximal incremental aggregated gradient method for nonconvex nonsmooth minimization problems
- A distributed flexible delay-tolerant proximal gradient algorithm
- Robust asynchronous stochastic gradient-push: asymptotically optimal and network-independent performance for strongly convex functions
- Parallel and distributed asynchronous adaptive stochastic gradient methods
- Random gradient extrapolation for distributed and stochastic optimization
- Perturbed iterate analysis for asynchronous stochastic optimization
- Global convergence rate of proximal incremental aggregated gradient methods
- Proximal-like incremental aggregated gradient method with Bregman distance in weakly convex optimization problems
- Nonconvex proximal incremental aggregated gradient method with linear convergence
- Distributed stochastic optimization with large delays
- Federated dual averaging learning algorithm with delayed gradients for composite optimization
- Primal-dual algorithms for multi-agent structured optimization over message-passing architectures with bounded communication delays
- Convergence of an asynchronous block-coordinate forward-backward algorithm for convex composite optimization
- Distributed deterministic asynchronous algorithms in time-varying graphs through Dykstra splitting
- Inertial proximal incremental aggregated gradient method with linear convergence guarantees
- Asynchronous level bundle methods
- A distributed quantile estimation algorithm of heavy-tailed distribution with massive datasets
- Asynchronous variance-reduced block schemes for composite non-convex stochastic optimization: block-specific steplengths and adapted batch-sizes