Discriminative Bayesian filtering lends momentum to the stochastic Newton method for minimizing log-convex functions
Publication: 6366198
DOI: 10.1007/s11590-022-01895-5
arXiv: 2104.12949
MaRDI QID: Q6366198
Author: Michael C. Burkhart
Publication date: 26 April 2021
Abstract: To minimize the average of a set of log-convex functions, the stochastic Newton method iteratively updates its estimate using subsampled versions of the full objective's gradient and Hessian. We contextualize this optimization problem as sequential Bayesian inference on a latent state-space model with a discriminatively specified observation process. Applying Bayesian filtering then yields a novel optimization algorithm that considers the entire history of gradients and Hessians when forming an update. We establish matrix-based conditions under which the effect of older observations diminishes over time, in a manner analogous to Polyak's heavy-ball momentum. We illustrate various aspects of our approach with an example and review other relevant innovations for the stochastic Newton method.
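The abstract's starting point is the subsampled Newton update, x_{k+1} = x_k − Ĥ(x_k)^{-1} ĝ(x_k), where ĝ and Ĥ are mini-batch estimates of the full objective's gradient and Hessian. Below is a minimal Python sketch of that baseline iteration on a toy family of log-convex functions f_i(x) = exp(½‖x − a_i‖²); the anchors a_i, batch size, and iteration count are illustrative assumptions, and this is the standard stochastic Newton method the paper builds on, not the paper's filtering algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (illustrative, not from the paper): minimize the average of
# the log-convex functions f_i(x) = exp(0.5 * ||x - a_i||^2) over random
# anchors a_i.  The minimizer is a weighted mean of the anchors.
d, n = 3, 200
anchors = rng.normal(size=(n, d))

def grad_hess(x, idx):
    """Subsampled gradient and Hessian of the average objective over idx."""
    g = np.zeros(d)
    H = np.zeros((d, d))
    for i in idx:
        r = x - anchors[i]
        w = np.exp(0.5 * (r @ r))                 # f_i(x)
        g += w * r                                # grad f_i = f_i(x) (x - a_i)
        H += w * (np.outer(r, r) + np.eye(d))     # hess f_i is positive definite
    return g / len(idx), H / len(idx)

x = rng.normal(size=d)
batch = 20
for k in range(50):
    idx = rng.choice(n, size=batch, replace=False)
    g, H = grad_hess(x, idx)
    x = x - np.linalg.solve(H, g)                 # subsampled Newton step

g_full, _ = grad_hess(x, np.arange(n))
print("estimate:", x)
print("full-gradient norm at estimate:", np.linalg.norm(g_full))
```

Because each step uses only the current mini-batch, the iterates fluctuate around the optimum; the paper's contribution is a filtered update that also weighs the history of past gradients and Hessians, with older observations losing influence over time, hence the analogy to heavy-ball momentum.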
MSC classifications: Convex programming (90C25) · Inference from stochastic processes and prediction (62M20) · Stochastic programming (90C15) · Newton-type methods (49M15)