Anisotropic Proximal Gradient

From MaRDI portal
Publication:6506901

arXiv: 2210.15531 · MaRDI QID: Q6506901 · FDO: Q6506901


Authors: Emanuel Laude, Panagiotis Patrinos



Abstract: This paper studies a novel algorithm for nonconvex composite minimization which can be interpreted in terms of dual-space nonlinear preconditioning for the classical proximal gradient method. The proposed scheme can be applied to composite minimization problems whose smooth part exhibits an anisotropic descent inequality relative to a reference function. In the convex case, this is a dual characterization of relative strong convexity in the Bregman sense. It is proved that the anisotropic descent property is closed under pointwise average if the dual Bregman distance is jointly convex and, more specifically, closed under pointwise conic combinations for the KL-divergence. We analyze the method's asymptotic convergence and prove its linear convergence under an anisotropic proximal gradient dominance condition. This is implied by anisotropic strong convexity, a recent dual characterization of relative smoothness in the Bregman sense. Applications are discussed, including exponentially regularized LPs and logistic regression with nonsmooth regularization. In the LP case, the method can be specialized to the Sinkhorn algorithm for regularized optimal transport and to a classical parallel-update algorithm for AdaBoost. Complementary to their existing primal interpretations in terms of entropic subspace projections, this provides a new dual interpretation in terms of forward-backward splitting with entropic preconditioning.
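The abstract notes that, for exponentially regularized LPs, the method specializes to the Sinkhorn algorithm for regularized optimal transport. The sketch below is not the paper's anisotropic scheme itself, only the standard Sinkhorn iteration for entropy-regularized transport that it specializes to; the function name, regularization parameter `eps`, and iteration count are illustrative choices, not taken from the paper.

```python
import numpy as np

def sinkhorn(C, r, c, eps=0.1, iters=200):
    """Standard Sinkhorn iteration for entropy-regularized optimal transport.

    C: cost matrix (m x n); r, c: source/target marginals (positive, sum to 1);
    eps: entropic regularization strength. Returns the transport plan P.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel from the cost matrix
    u = np.ones_like(r)                  # dual scaling, initialized uniformly
    for _ in range(iters):
        v = c / (K.T @ u)                # match the column marginals
        u = r / (K @ v)                  # match the row marginals
    v = c / (K.T @ u)                    # final column rescaling
    return u[:, None] * K * v[None, :]   # plan P = diag(u) K diag(v)

# Illustrative usage on a small symmetric problem:
C = np.array([[0.0, 1.0], [1.0, 0.0]])
r = np.array([0.5, 0.5])
c = np.array([0.5, 0.5])
P = sinkhorn(C, r, c)
```

Each sweep alternately rescales the rows and columns of the Gibbs kernel, which is exactly the entropic-projection view the abstract contrasts with the new dual forward-backward interpretation.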