On the Use of Stochastic Hessian Information in Optimization Methods for Machine Learning

Publication: 3105787

DOI: 10.1137/10079923X
zbMath: 1245.65062
OpenAlex: W1991083751
MaRDI QID: Q3105787

Richard H. Byrd, Gillian M. Chin, Will Neveitt, Jorge Nocedal

Publication date: 9 January 2012

Published in: SIAM Journal on Optimization

Full work available at URL: https://doi.org/10.1137/10079923x
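
The paper studies Newton-type methods in which gradient information is computed from the full (or a large) sample while the curvature information used inside a conjugate-gradient (CG) solver comes from Hessian-vector products evaluated on only a small random subsample of the data. The sketch below illustrates that structure on a logistic-regression objective; the objective, subsample fraction, fixed unit step length, and all function names are assumptions made for illustration, not the authors' implementation.

```python
# A minimal sketch of the subsampled-Hessian Newton-CG idea: the gradient
# uses the full dataset, while Hessian-vector products inside CG use only
# a random subsample. Problem setup and parameters are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def full_gradient(w, X, y):
    # gradient of (1/N) * sum_i log(1 + exp(-y_i * x_i^T w)), y_i in {-1, +1}
    return X.T @ (-y * sigmoid(-y * (X @ w))) / len(y)

def make_hessian_vec(w, X_S):
    # Hessian-vector product using only the subsample X_S:
    # H_S v = (1/|S|) * X_S^T diag(p * (1 - p)) X_S v
    p = sigmoid(X_S @ w)
    d = p * (1.0 - p)
    return lambda v: X_S.T @ (d * (X_S @ v)) / X_S.shape[0]

def cg(hess_vec, b, max_iter=20, tol=1e-6):
    # standard conjugate gradient for H x = b, accessed only via products
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Hp = hess_vec(p)
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def subsampled_newton_cg(X, y, iters=20, subsample=0.1, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    m = max(1, int(subsample * n))
    for _ in range(iters):
        g = full_gradient(w, X, y)           # exact gradient on all data
        S = rng.choice(n, size=m, replace=False)
        hv = make_hessian_vec(w, X[S])       # stochastic Hessian information
        w -= step * cg(hv, g)                # inexact Newton step via CG
    return w
```

Because CG touches the Hessian only through products H·v, evaluating those products on a subsample makes each Newton iteration far cheaper than with the full Hessian, which is the source of the savings the paper analyzes.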



Related Items

On data preconditioning for regularized loss minimization
Clustering-based preconditioning for stochastic programs
Probabilistic learning inference of boundary value problem with uncertainties based on Kullback-Leibler divergence under implicit constraints
Quasi-Newton methods for machine learning: forget the past, just sample
Descent direction method with line search for unconstrained optimization in noisy environment
A stochastic extra-step quasi-Newton method for nonsmooth nonconvex optimization
A nonmonotone line search method for stochastic optimization problems
Predictive coarse-graining
SCORE: approximating curvature information under self-concordant regularization
Adaptive stochastic approximation algorithm
An overview of stochastic quasi-Newton methods for large-scale machine learning
Newton-MR: inexact Newton method with minimum residual sub-problem solver
Generalized linear models for massive data via doubly-sketching
Nonlinear Gradient Mappings and Stochastic Optimization: A General Framework with Applications to Heavy-Tail Noise
Hessian averaging in stochastic Newton methods achieves superlinear convergence
Newton Sketch: A Near Linear-Time Optimization Algorithm with Linear-Quadratic Convergence
Discriminative Bayesian filtering lends momentum to the stochastic Newton method for minimizing log-convex functions
Stable architectures for deep neural networks
Subsampled Hessian Newton Methods for Supervised Learning
Entropy-based closure for probabilistic learning on manifolds
Parallel Optimization Techniques for Machine Learning
Convergence of Newton-MR under Inexact Hessian Information
Sub-sampled Newton methods
Spectral projected gradient method for stochastic optimization
Fast Approximation of the Gauss--Newton Hessian Matrix for the Multilayer Perceptron
Probabilistic learning on manifolds constrained by nonlinear partial differential equations for small datasets
Optimization Methods for Large-Scale Machine Learning
Distributed Newton Methods for Deep Neural Networks
Design optimization under uncertainties of a mesoscale implant in biological tissues using a probabilistic learning algorithm
Stochastic Quasi-Newton Methods for Nonconvex Stochastic Optimization
Robust inversion, dimensionality reduction, and randomized sampling
Sample size selection in optimization methods for machine learning
Nonlinear optimization and support vector machines
A Stochastic Quasi-Newton Method for Large-Scale Optimization
Stochastic sub-sampled Newton method with variance reduction
Compact representations of structured BFGS matrices
A robust multi-batch L-BFGS method for machine learning
Parallel Simultaneous Perturbation Optimization
An Inertial Newton Algorithm for Deep Learning
A Stochastic Semismooth Newton Method for Nonsmooth Nonconvex Optimization
On the local convergence of a stochastic semismooth Newton method for nonsmooth nonconvex optimization
Linesearch Newton-CG methods for convex optimization with noise
Nonmonotone line search methods with variable sample size
LSOS: Line-search second-order stochastic optimization methods for nonconvex finite sums
Newton-like Method with Diagonal Correction for Distributed Optimization