Composite difference-MAX programs for modern statistical estimation problems
From MaRDI portal
Publication:4562249
Abstract: Many modern statistical estimation problems are defined by three major components: a statistical model that postulates the dependence of an output variable on the input features; a loss function measuring the error between the observed output and the model-predicted output; and a regularizer that controls overfitting and/or variable selection in the model. We study the sampling version of this generic statistical estimation problem, in which the model parameters are estimated by empirical risk minimization: the minimization of the empirical average of the loss function over the data points, combined with the model regularizer. In our setup we allow all three component functions to be of the difference-of-convex (DC) type and illustrate them with a host of commonly used examples, including continuous piecewise affine regression and deep learning with piecewise affine activation functions. We describe a nonmonotone majorization-minimization (MM) algorithm for solving the unified nonconvex, nondifferentiable optimization problem, which is formulated as a specially structured composite DC program of the pointwise-max type, and present convergence results to a directional stationary solution. An efficient semismooth Newton method is proposed to solve the duals of the MM subproblems. Numerical results demonstrate the effectiveness of the proposed algorithm and the superiority of continuous piecewise affine regression over the standard linear model.
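The core algorithmic idea behind the abstract — majorize the concave part of a DC objective at the current iterate, then solve the resulting convex subproblem — can be sketched on one of the simplest DC regularizers from the cited literature, the capped-\(\ell_1\) penalty. The sketch below is a generic DCA/MM loop, not the paper's composite pointwise-max algorithm or its semismooth Newton dual solver; the function name, parameters, and the inner ISTA solver are all illustrative choices.

```python
import numpy as np

def dca_capped_l1(A, b, lam=0.1, tau=0.5, outer=20, inner=200):
    """Illustrative DCA/MM sketch for capped-l1 regularized least squares:

        min_x  0.5 * ||A x - b||^2 + lam * sum_i min(|x_i|, tau)

    using the DC split  min(|t|, tau) = |t| - max(|t| - tau, 0),
    i.e. objective = [smooth + lam*||x||_1] (convex)  -  h(x) (convex).
    """
    m, n = A.shape
    x = np.zeros(n)
    # Lipschitz constant of the gradient of the smooth part (sigma_max(A)^2).
    L = np.linalg.norm(A, 2) ** 2
    for _ in range(outer):
        # Subgradient of h(x) = lam * sum_i max(|x_i| - tau, 0); linearizing
        # -h at the current iterate gives a convex majorizer of the objective.
        s = lam * np.sign(x) * (np.abs(x) > tau)
        # Convex subproblem: min 0.5||Ax-b||^2 - s.x + lam*||x||_1,
        # solved approximately by ISTA (proximal gradient) steps.
        for _ in range(inner):
            g = A.T @ (A @ x - b) - s
            z = x - g / L
            x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return x
```

Because each outer step minimizes a convex surrogate that touches the true objective at the current iterate, the objective value is nonincreasing along the iterates; the paper's MM scheme relaxes exactly this monotonicity while still guaranteeing convergence to a directional stationary point.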
Recommendations
- On the pervasiveness of difference-convexity in optimization and statistics
- A difference of convex optimization algorithm for piecewise linear regression
- MM for penalized estimation
- Difference-of-convex learning: directional stationarity, optimality, and sparsity
- Regularized \(M\)-estimators with nonconvexity: statistical and algorithmic theory for local optima
Cites work
- scientific article; zbMATH DE number 4082855
- scientific article; zbMATH DE number 44577
- scientific article; zbMATH DE number 46303
- scientific article; zbMATH DE number 3567782
- scientific article; zbMATH DE number 3626409
- scientific article; zbMATH DE number 6438182
- A Computational Framework for Multivariate Convex Regression and Its Variants
- A Newton-CG augmented Lagrangian method for semidefinite programming
- A Quadratically Convergent Newton Method for Computing the Nearest Correlation Matrix
- A globally convergent Newton method for convex \(SC^ 1\) minimization problems
- A highly efficient semismooth Newton augmented Lagrangian method for solving lasso problems
- A nonsmooth version of Newton's method
- A proximal difference-of-convex algorithm with extrapolation
- A semismooth Newton method with multidimensional filter globalization for \(l_1\)-optimization
- An algorithm for the estimation of a regression function by continuous piecewise linear functions
- Automatic speech recognition. A deep learning approach
- Computing B-stationary points of nonsmooth DC programs
- Convergence analysis of difference-of-convex algorithm with subanalytic data
- Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods
- Convex analysis and monotone operator theory in Hilbert spaces
- DC approximation approaches for sparse optimization
- Difference-of-convex learning: directional stationarity, optimality, and sparsity
- Extension of Newton and quasi-Newton methods to systems of PC\(^1\) equations
- Enhanced proximal DC algorithms with extrapolation for a class of structured nonsmooth DC minimization
- Enhancing sparsity by reweighted \(\ell _{1}\) minimization
- Finite-Dimensional Variational Inequalities and Complementarity Problems
- Incremental majorization-minimization optimization with application to large-scale machine learning
- Iterative Solution of Nonlinear Equations in Several Variables
- MM optimization algorithms
- Majorization-minimization procedures and convergence of SQP methods for semi-algebraic and tame programs
- Minimization of \(\ell_{1-2}\) for compressed sensing
- Multivariate convex regression with adaptive partitioning
- Nearly unbiased variable selection under minimax concave penalty
- Newton's Method for B-Differentiable Equations
- Nonlinear programming
- On functions representable as a difference of convex functions
- On the convergence of the proximal algorithm for nonsmooth functions involving analytic features
- On the convergence properties of the EM algorithm
- On the structure of convex piecewise quadratic functions
- Proximal alternating linearized minimization for nonconvex and nonsmooth problems
- Regression Quantiles
- Robust multicategory support vector machines using difference convex algorithm
- SDPNAL+: a majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints
- Semismooth and Semiconvex Functions in Constrained Optimization
- Sparsity and Smoothness Via the Fused Lasso
- Structural properties of affine sparsity constraints
- Support-vector networks
- The DC (Difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems
- The Łojasiewicz Inequality for Nonsmooth Subanalytic Functions with Applications to Subgradient Dynamical Systems
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
- Variational Analysis
Cited in (33)
- A fast and effective algorithm for sparse linear regression with \(\ell_p\)-norm data fidelity and elastic net regularization
- Markov chain stochastic DCA and applications in deep learning with PDEs regularization
- On the pervasiveness of difference-convexity in optimization and statistics
- On the superiority of PGMs to PDCAs in nonsmooth nonconvex sparse regression
- Max-affine regression via first-order methods
- Spectrahedral Regression
- Solving Nonsmooth and Nonconvex Compound Stochastic Programs with Applications to Risk Measure Minimization
- Asymptotic Properties of Stationary Solutions of Coupled Nonconvex Nonsmooth Empirical Risk Minimization
- Nonconvex and nonsmooth approaches for affine chance-constrained stochastic programs
- Approximations of semicontinuous functions with applications to stochastic optimization and statistical estimation
- Multicomposite nonconvex optimization for training deep neural networks
- Nonconvex robust programming via value-function optimization
- A global two-stage algorithm for non-convex penalized high-dimensional linear regression problems
- An augmented Lagrangian method with constraint generation for shape-constrained convex regression problems
- Exact guarantees on the absence of spurious local minima for non-negative rank-1 robust principal component analysis
- A study of piecewise linear-quadratic programs
- High-order optimization methods for fully composite problems
- scientific article; zbMATH DE number 7306909
- Hybrid Algorithms for Finding a D-Stationary Point of a Class of Structured Nonsmooth DC Minimization
- An inexact proximal majorization-minimization algorithm for remote sensing image stripe noise removal
- Optimality conditions for locally Lipschitz optimization with \(l_0\)-regularization
- Estimation of Knots in Linear Spline Models
- Proximal distance algorithms: theory and practice
- Frank-Wolfe-type methods for a class of nonconvex inequality-constrained problems
- Penalty and augmented Lagrangian methods for constrained DC programming
- Stochastic difference-of-convex-functions algorithms for nonconvex programming
- Manifold sampling for optimizing nonsmooth nonconvex compositions
- Estimation of individualized decision rules based on an optimized covariate-dependent equivalent of random outcomes
- Consistent approximations in composite optimization
- Stability and error analysis for optimization and generalized equations
- Strong metric (sub)regularity of Karush-Kuhn-Tucker mappings for piecewise linear-quadratic convex-composite optimization and the quadratic convergence of Newton's method
- An efficient semismooth Newton method for adaptive sparse signal recovery problems
- A study of convex convex-composite functions via infimal convolution with applications
This page was built for publication: Composite difference-MAX programs for modern statistical estimation problems