Trust-region algorithms for training responses: machine learning methods using indefinite Hessian approximations
From MaRDI portal
Publication:5113710
Abstract: Machine learning (ML) problems are often posed as highly nonlinear and nonconvex unconstrained optimization problems. Methods for solving ML problems based on stochastic gradient descent scale easily to very large problems but may involve fine-tuning many hyper-parameters. Quasi-Newton approaches based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) update typically do not require manual hyper-parameter tuning but suffer from approximating a potentially indefinite Hessian with a positive-definite matrix. Hessian-free methods leverage the ability to perform Hessian-vector products without forming the entire Hessian matrix, but the complexity of each iteration is significantly greater than that of quasi-Newton methods. In this paper we propose an alternative approach for solving ML problems, based on a quasi-Newton trust-region framework for large-scale optimization that allows indefinite Hessian approximations. Numerical experiments on a standard test data set show that, for a fixed computational time budget, the proposed methods achieve better results than the traditional limited-memory BFGS and Hessian-free methods.
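The key ingredient described in the abstract, a quasi-Newton update that is allowed to produce an indefinite Hessian approximation (the symmetric rank-one, SR1, update) combined with a trust-region step, can be illustrated with a minimal sketch. This is not the paper's L-SR1 implementation: the paper's methods exploit limited-memory structure, whereas this sketch uses small dense matrices and an eigendecomposition-based subproblem solver, and it ignores the so-called hard case. All function names here are illustrative.

```python
import numpy as np

def sr1_update(B, s, y, tol=1e-8):
    """Symmetric rank-one update. Unlike BFGS, the result may be
    indefinite, which lets it capture negative curvature."""
    r = y - B @ s
    denom = r @ s
    # Standard safeguard: skip the update when the denominator is tiny.
    if abs(denom) < tol * np.linalg.norm(r) * np.linalg.norm(s):
        return B
    return B + np.outer(r, r) / denom

def trust_region_step(B, g, delta):
    """Solve min_p  g^T p + 0.5 p^T B p  s.t. ||p|| <= delta,
    where B may be indefinite. Uses the eigendecomposition of B
    (fine for small dense B; the hard case is not handled)."""
    lam, Q = np.linalg.eigh(B)
    gq = Q.T @ g

    def step_norm(sigma):
        # ||p(sigma)|| for p(sigma) = -(B + sigma I)^{-1} g
        return np.linalg.norm(gq / (lam + sigma))

    # Interior solution is optimal only if B is positive definite
    # and the unconstrained Newton step fits in the trust region.
    if lam.min() > 0 and step_norm(0.0) <= delta:
        return Q @ (-gq / lam)

    # Otherwise find sigma >= max(0, -lambda_min) with
    # ||p(sigma)|| = delta by bisection on the shift sigma.
    sigma_min = max(0.0, -lam.min())
    lo, hi = sigma_min + 1e-12, sigma_min + 1.0
    while step_norm(hi) > delta:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if step_norm(mid) > delta:
            lo = mid
        else:
            hi = mid
    sigma = 0.5 * (lo + hi)
    return Q @ (-gq / (lam + sigma))
```

With an indefinite model Hessian, a positive-definite BFGS-style approximation would ignore the negative-curvature direction, while the trust-region step above follows it to the boundary of the trust region; this is the behavior the proposed framework is designed to preserve.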
Recommendations
- Quasi-Newton methods for machine learning: forget the past, just sample
- On the use of stochastic Hessian information in optimization methods for machine learning
- A robust multi-batch L-BFGS method for machine learning
- Optimization methods for large-scale machine learning
- A stochastic quasi-Newton method for large-scale optimization
Cites work
- scientific article; zbMATH DE number 3984475 (no title available)
- scientific article; zbMATH DE number 1186893 (no title available)
- scientific article; zbMATH DE number 3725604 (no title available)
- scientific article; zbMATH DE number 6276119 (no title available)
- scientific article; zbMATH DE number 5060482 (no title available)
- A Stochastic Approximation Method
- A Subspace Minimization Method for the Trust-Region Step
- A new matrix-free algorithm for the large-scale trust-region subproblem
- A quasi-Newton algorithm for nonconvex, nonsmooth optimization with global convergence guarantees
- A robust multi-batch L-BFGS method for machine learning
- A stochastic quasi-Newton method for large-scale optimization
- Adaptive subgradient methods for online learning and stochastic optimization
- Algorithm 873
- Computing Optimal Locally Constrained Steps
- Computing a Trust Region Step
- Convergence of quasi-Newton matrices generated by the symmetric rank one update
- Iterative methods for finding a trust-region step
- Minimizing a quadratic over a sphere
- On efficiently computing the eigenvalues of limited-memory quasi-Newton matrices
- On solving L-SR1 trust-region subproblems
- On the limited memory BFGS method for large scale optimization
- Optimization methods for large-scale machine learning
- Representations of quasi-Newton matrices and their use in limited memory methods
- Sample size selection in optimization methods for machine learning
- Solving the Trust-Region Subproblem using the Lanczos Method
- Testing a Class of Methods for Solving Minimization Problems with Simple Bounds on the Variables
- The Conjugate Gradient Method and Trust Regions in Large Scale Optimization
- The elements of statistical learning. Data mining, inference, and prediction
- Trust Region Methods
- Updating Quasi-Newton Matrices with Limited Storage
Cited in (4)
- Second-order design sensitivity analysis using diagonal hyper-dual numbers
- A non-monotone trust-region method with noisy oracles and additional sampling
- Globally Convergent Multilevel Training of Deep Residual Networks
- A limited-memory trust-region method for nonlinear optimization with many equality constraints