Minimizing finite sums with the stochastic average gradient


DOI: 10.1007/s10107-016-1030-6
zbMATH: 1358.90073
arXiv: 1309.2388
OpenAlex: W2963156201
MaRDI QID: Q517295

Mark Schmidt, Nicolas Le Roux, Francis Bach

Publication date: 23 March 2017

Published in: Mathematical Programming. Series A. Series B

Full work available at URL: https://arxiv.org/abs/1309.2388
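
For context, the paper analyzes the stochastic average gradient (SAG) method for minimizing a finite sum g(x) = (1/n) * sum_i f_i(x): each iteration evaluates the gradient of a single randomly chosen term, as in stochastic gradient descent, but a memory of the most recent gradient of every term lets the method step along the average of the stored gradients, which gives a linear convergence rate when g is strongly convex and each f_i is smooth. Below is a minimal sketch of that update rule, assuming NumPy; the function names, the constant step size, and the least-squares test problem are illustrative choices and not the authors' reference implementation (the paper itself discusses step-size selection and non-uniform sampling).

```python
import numpy as np

def sag(grad_i, x0, n, alpha, iters, seed=0):
    """Sketch of the stochastic average gradient (SAG) update.

    Minimizes g(x) = (1/n) * sum_i f_i(x), where grad_i(i, x)
    returns the gradient of the i-th term at x. A table holds the
    most recent gradient seen for each f_i; each iteration refreshes
    one entry and steps along the average of the stored gradients.
    """
    rng = np.random.default_rng(seed)
    x = x0.astype(float)
    y = np.zeros((n, x.size))  # stored gradient for each term (initialized to 0)
    d = np.zeros(x.size)       # running sum of the stored gradients
    for _ in range(iters):
        i = rng.integers(n)    # sample one term uniformly at random
        g = grad_i(i, x)
        d += g - y[i]          # swap the old gradient of f_i for the new one
        y[i] = g
        x -= (alpha / n) * d   # step along the averaged gradient estimate
    return x

# Illustrative use on least squares: f_i(x) = 0.5 * (a_i @ x - b_i)**2.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(200, 5)), rng.normal(size=200)
x_hat = sag(lambda i, x: (A[i] @ x - b[i]) * A[i],
            x0=np.zeros(5), n=200, alpha=0.05, iters=20000)
```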



Related Items

Coupled Generation
On inexact stochastic splitting methods for a class of nonconvex composite optimization problems with relative error
Accelerated and Instance-Optimal Policy Evaluation with Linear Function Approximation
Convergence analysis of a subsampled Levenberg-Marquardt algorithm
An aggressive reduction on the complexity of optimization for non-strongly convex objectives
An event-triggering algorithm for decentralized stochastic optimization over networks
A stochastic averaging gradient algorithm with multi-step communication for distributed optimization
Recursive ridge regression using second-order stochastic algorithms
Unified analysis of stochastic gradient methods for composite convex and smooth optimization
A stochastic variance reduced gradient using Barzilai-Borwein techniques as second order information
SVRG meets AdaGrad: painless variance reduction
A mini-batch proximal stochastic recursive gradient algorithm with diagonal Barzilai-Borwein stepsize
A framework of convergence analysis of mini-batch stochastic projected gradient methods
On a general structure for adaptation/learning algorithms. -- Stability and performance issues
A modified stochastic quasi-Newton algorithm for summing functions problem in machine learning
Block mirror stochastic gradient method for stochastic optimization
A line search based proximal stochastic gradient algorithm with dynamical variance reduction
Improving sampling accuracy of stochastic gradient MCMC methods via non-uniform subsampling of gradients
Accelerating stochastic sequential quadratic programming for equality constrained optimization using predictive variance reduction
Adaptive proximal SGD based on new estimating sequences for sparser ERM
Open issues and recent advances in DC programming and DCA
Random-reshuffled SARAH does not need full gradient computations
Recent theoretical advances in decentralized distributed convex optimization
Recent Theoretical Advances in Non-Convex Optimization
Bregman Finito/MISO for Nonconvex Regularized Finite Sum Minimization without Lipschitz Gradient Continuity
Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms
Stochastic accelerated alternating direction method of multipliers with importance sampling
Block-coordinate and incremental aggregated proximal gradient methods for nonsmooth nonconvex problems
An Accelerated Randomized Proximal Coordinate Gradient Method and its Application to Regularized Empirical Risk Minimization
GADMM: Fast and Communication Efficient Framework for Distributed Machine Learning
General framework for binary classification on top samples
Quasi-Newton methods for machine learning: forget the past, just sample
A stochastic extra-step quasi-Newton method for nonsmooth nonconvex optimization
Finite-sum smooth optimization with SARAH
Stochastic Learning Approach for Binary Optimization: Application to Bayesian Optimal Design of Experiments
Complexity Analysis of stochastic gradient methods for PDE-constrained optimal Control Problems with uncertain parameters
Adaptive Sampling Strategies for Stochastic Optimization
Minimizing robust estimates of sums of parameterized functions
Sketched Newton--Raphson
Accelerating incremental gradient optimization with curvature information
Optimizing Adaptive Importance Sampling by Stochastic Approximation
Improving kernel online learning with a snapshot memory
Cocoercivity, smoothness and bias in variance-reduced stochastic gradient methods
Acceleration on Adaptive Importance Sampling with Sample Average Approximation
Convergence rates of accelerated proximal gradient algorithms under independent noise
The multiproximal linearization method for convex composite problems
On Stochastic and Deterministic Quasi-Newton Methods for Nonstrongly Convex Optimization: Asymptotic Convergence and Rate Analysis
Linear convergence of cyclic SAGA
Generalized forward-backward splitting with penalization for monotone inclusion problems
An accelerated variance reducing stochastic method with Douglas-Rachford splitting
Inexact proximal stochastic gradient method for convex composite optimization
Stochastic Reformulations of Linear Systems: Algorithms and Convergence Theory
Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning
Batched Stochastic Gradient Descent with Weighted Sampling
A Continuous-Time Analysis of Distributed Stochastic Gradient
Bi-fidelity stochastic gradient descent for structural optimization under uncertainty
Multilevel Stochastic Gradient Methods for Nested Composition Optimization
A linearly convergent stochastic recursive gradient method for convex optimization
Stochastic Conditional Gradient++: (Non)Convex Minimization and Continuous Submodular Maximization
On variance reduction for stochastic smooth convex optimization with multiplicative noise
A Smooth Inexact Penalty Reformulation of Convex Problems with Linear Constraints
Surpassing Gradient Descent Provably: A Cyclic Incremental Method with Linear Convergence Rate
Optimization Methods for Large-Scale Machine Learning
Leveraged least trimmed absolute deviations
An Optimal Algorithm for Decentralized Finite-Sum Optimization
Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice
IQN: An Incremental Quasi-Newton Method with Local Superlinear Convergence Rate
Convergence of stochastic proximal gradient algorithm
Point process estimation with Mirror Prox algorithms
Generalized stochastic Frank-Wolfe algorithm with stochastic ``substitute'' gradient for structured convex optimization
Analysis of biased stochastic gradient descent using sequential semidefinite programs
Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods
Convergence rates for optimised adaptive importance samplers
The Averaged Kaczmarz Iteration for Solving Inverse Problems
Random Gradient Extrapolation for Distributed and Stochastic Optimization
Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling and Imaging Applications
Ensemble Kalman inversion: a derivative-free technique for machine learning tasks
Stochastic proximal quasi-Newton methods for non-convex composite optimization
Randomized smoothing variance reduction method for large-scale non-smooth convex optimization
Stochastic quasi-gradient methods: variance reduction via Jacobian sketching
Relative utility bounds for empirically optimal portfolios
Multivariate goodness-of-fit tests based on Wasserstein distance
Fast and safe: accelerated gradient methods with optimality certificates and underestimate sequences
A stochastic primal-dual method for optimization with conditional value at risk constraints
A stochastic trust region method for unconstrained optimization problems
Provable accelerated gradient method for nonconvex low rank optimization
On the regularization effect of stochastic gradient descent applied to least-squares
Stochastic DCA for minimizing a large sum of DC functions with application to multi-class logistic regression
Analysis of stochastic gradient descent in continuous time
Fast incremental expectation maximization for finite-sum optimization: nonasymptotic convergence
Fully asynchronous policy evaluation in distributed reinforcement learning over networks
The recursive variational Gaussian approximation (R-VGA)
An Inexact Variable Metric Proximal Point Algorithm for Generic Quasi-Newton Acceleration
Adaptive Sampling for Incremental Optimization Using Stochastic Gradient Descent
Deep relaxation: partial differential equations for optimizing deep neural networks
Proximal-Like Incremental Aggregated Gradient Method with Linear Convergence Under Bregman Distance Growth Conditions
A randomized incremental primal-dual method for decentralized consensus optimization
A Stochastic Semismooth Newton Method for Nonsmooth Nonconvex Optimization
Incremental proximal gradient scheme with penalization for constrained composite convex optimization problems
Variable metric proximal stochastic variance reduced gradient methods for nonconvex nonsmooth optimization
Stochastic average gradient algorithm for multirate FIR models with varying time delays using self-organizing maps
Multi-agent reinforcement learning: a selective overview of theories and algorithms
PDE-Constrained Optimal Control Problems with Uncertain Parameters using SAGA
Stochastic proximal linear method for structured non-convex problems
Inexact SARAH algorithm for stochastic optimization
A hierarchically low-rank optimal transport dissimilarity measure for structured data
Linear convergence of proximal incremental aggregated gradient method for nonconvex nonsmooth minimization problems
A stochastic first-order trust-region method with inexact restoration for finite-sum minimization
A Stochastic Proximal Alternating Minimization for Nonsmooth and Nonconvex Optimization
Accelerating variance-reduced stochastic gradient methods
A hybrid stochastic optimization framework for composite nonconvex optimization


Uses Software


Cites Work