An accelerated proximal algorithm for regularized nonconvex and nonsmooth bi-level optimization
From MaRDI portal
Publication:6161202
Abstract: Many important machine learning applications involve regularized nonconvex bi-level optimization. However, existing gradient-based bi-level optimization algorithms cannot handle nonconvex or nonsmooth regularizers, and they suffer from high computational complexity in nonconvex bi-level optimization. In this work, we study a proximal gradient-type algorithm that adopts the approximate implicit differentiation (AID) scheme for nonconvex bi-level optimization with possibly nonconvex and nonsmooth regularizers. In particular, the algorithm applies Nesterov's momentum to accelerate the computation of the implicit gradient involved in AID. We provide a comprehensive analysis of the global convergence properties of this algorithm by identifying its intrinsic potential function. In particular, we formally establish the convergence of the model parameters to a critical point of the bi-level problem and obtain an improved computational complexity over the state-of-the-art result. Moreover, we analyze the asymptotic convergence rates of this algorithm under a class of local nonconvex geometries characterized by a Łojasiewicz-type gradient inequality. Experiments on hyper-parameter optimization demonstrate the effectiveness of our algorithm.
Recommendations
- Convex Bi-level Optimization Problems with Nonsmooth Outer Objective Function
- Methodology and first-order algorithms for solving nonsmooth and non-strongly convex bilevel optimization problems
- Accelerated fixed point algorithm for convex bi-level optimization problems in Hilbert spaces with applications
- A gentle introduction to algorithms for bilevel optimization from machine learning
Cites work
- scientific article; zbMATH DE number 3371284
- A Two-Timescale Stochastic Algorithm Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
- An extended Kuhn-Tucker approach for linear bilevel programming
- Distributed Proximal Gradient Algorithm for Partially Asynchronous Computer Clusters
- Introductory lectures on convex optimization. A basic course.
- Mathematical Programs with Optimization Problems in the Constraints
- New Branch-and-Bound Rules for Linear Bilevel Programming
- On the convergence of the proximal algorithm for nonsmooth functions involving analytic features
- On Actor-Critic Algorithms
- Proximal alternating linearized minimization for nonconvex and nonsmooth problems
- Splitting Algorithms for the Sum of Two Nonlinear Operators
- Splitting methods with variable metric for Kurdyka-Łojasiewicz functions and general convergence rates
- The Łojasiewicz Inequality for Nonsmooth Subanalytic Functions with Applications to Subgradient Dynamical Systems
Cited in (5)
- Alternating proximal algorithm for the problem of bi-level convex minimization
- Accelerated fixed point algorithm for convex bi-level optimization problems in Hilbert spaces with applications
- Analyzing inexact hypergradients for bilevel learning
- A gentle introduction to algorithms for bilevel optimization from machine learning
- Convex Bi-level Optimization Problems with Nonsmooth Outer Objective Function