Best subset selection via a modern optimization lens
From MaRDI portal
Abstract: In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving Mixed Integer Optimization (MIO) problems. We present a MIO approach for solving the classical best subset selection problem of choosing \(k\) out of \(p\) features in linear regression given \(n\) observations. We develop a discrete extension of modern first-order continuous optimization methods to find high-quality feasible solutions that we use as warm starts to a MIO solver that finds provably optimal solutions. The resulting algorithm (a) provides a solution with a guarantee on its suboptimality even if we terminate the algorithm early, (b) can accommodate side constraints on the coefficients of the linear regression, and (c) extends to finding best subset solutions for the least absolute deviation loss function. Using a wide variety of synthetic and real datasets, we demonstrate that our approach solves problems with \(n\) in the 1000s and \(p\) in the 100s in minutes to provable optimality, and finds near-optimal solutions for \(n\) in the 100s and \(p\) in the 1000s in minutes. We also establish via numerical experiments that the MIO approach performs better than Lasso and other popularly used sparse learning procedures in terms of achieving sparse solutions with good predictive power.
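The discrete first-order method mentioned in the abstract can be sketched as projected gradient descent on the least-squares loss with a hard-thresholding step that keeps only the \(k\) largest-magnitude coefficients. The sketch below is an illustrative assumption about that scheme, not the authors' implementation; the function name `discrete_first_order` and the final least-squares refit on the selected support are our own choices.

```python
import numpy as np

def discrete_first_order(X, y, k, n_iter=200):
    """Sketch of min ||y - X b||^2 subject to ||b||_0 <= k.

    Each step takes a gradient step on the least-squares loss, then
    projects onto the sparsity constraint by keeping only the k
    largest-magnitude entries (hard thresholding).
    """
    n, p = X.shape
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # (the largest eigenvalue of X'X).
    L = np.linalg.eigvalsh(X.T @ X).max()
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)
        u = b - grad / L
        # Hard-threshold: zero out all but the k largest |u_i|.
        keep = np.argsort(np.abs(u))[-k:]
        b_new = np.zeros(p)
        b_new[keep] = u[keep]
        if np.allclose(b_new, b):
            break
        b = b_new
    # Polish: refit ordinary least squares on the selected support.
    support = np.flatnonzero(b)
    if support.size:
        b[support] = np.linalg.lstsq(X[:, support], y, rcond=None)[0]
    return b
```

In the paper's pipeline, a feasible solution of this kind would serve as a warm start for the MIO solver, which then certifies optimality or reports a suboptimality gap if stopped early.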
Recommendations
- OR forum: An algorithmic approach to linear regression
- An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series
- Fast best subset selection: coordinate descent and local combinatorial optimization algorithms
- A new approach to select the best subset of predictors in linear regression modelling: bi-objective mixed integer linear programming
- An effective procedure for feature subset selection in logistic regression based on information criteria
Cites work
- scientific article; zbMATH DE number 5957408
- scientific article; zbMATH DE number 1906319
- scientific article; zbMATH DE number 845714
- A Statistical View of Some Chemometrics Regression Tools
- A brief history of linear and mixed-integer programming computation
- A general theory of concave regularization for high-dimensional sparse estimation problems
- A unified approach to model selection and sparse recovery using regularized least squares
- Aggregation for Gaussian regression
- Algorithm for cardinality-constrained quadratic optimization
- Analysis of multi-stage convex relaxation for sparse regularization
- Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space
- Asymptotics for Lasso-type estimators
- Atomic Decomposition by Basis Pursuit
- Best subset selection via a modern optimization lens
- Best subset selection, persistence in high-dimensional statistical learning and optimization under l₁ constraint
- Certifying the Restricted Isometry Property is Hard
- Computational study of a family of mixed-integer quadratic programming problems
- Enhancing sparsity by reweighted \(\ell _{1}\) minimization
- For most large underdetermined systems of linear equations the minimal ℓ₁-norm solution is also the sparsest solution
- Gradient methods for minimizing composite functions
- Graph puzzles, homotopy, and the alternating group
- High-dimensional graphs and variable selection with the Lasso
- Ideal spatial adaptation by wavelet shrinkage
- Introductory lectures on convex optimization. A basic course.
- Iterative hard thresholding for compressed sensing
- Iterative thresholding for sparse approximations
- Just relax: convex programming methods for identifying sparse signals in noise
- Least angle regression. (With discussion)
- Least quantile regression via modern optimization
- Minimax Rates of Estimation for High-Dimensional Linear Regression Over $\ell_q$-Balls
- Near-Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?
- Near-ideal model selection by \(\ell _{1}\) minimization
- Nearly unbiased variable selection under minimax concave penalty
- Nonconcave Penalized Likelihood With NP-Dimensionality
- On constrained and regularized high-dimensional regression
- One-step sparse estimates in nonconcave penalized likelihood models
- Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ 1 minimization
- Pathwise coordinate optimization
- Persistence in high-dimensional linear predictor selection and the virtue of overparametrization
- Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using $\ell _{1}$-Constrained Quadratic Programming (Lasso)
- Simultaneous analysis of Lasso and Dantzig selector
- Smooth minimization of non-smooth functions
- Sparse Approximate Solutions to Linear Systems
- SparseNet: coordinate descent with nonconvex penalties
- Statistics for high-dimensional data. Methods, theory and applications.
- The Adaptive Lasso and Its Oracle Properties
- The adaptive and the thresholded Lasso for potentially misspecified models (and a lower bound for the Lasso)
- The restricted isometry property and its implications for compressed sensing
- The sparsity and bias of the LASSO selection in high-dimensional linear regression
- Uncertainty principles and ideal atomic decomposition
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Cited in
(only showing first 100 items)
- High-Dimensional Cost-constrained Regression Via Nonconvex Optimization
- A penalty decomposition approach for multi-objective cardinality-constrained optimization problems
- A Bayesian perspective of statistical machine learning for big data
- Factor-driven two-regime regression
- An attention algorithm for solving large scale structured \(l_0\)-norm penalty estimation problems
- Fitting sparse linear models under the sufficient and necessary condition for model identification
- Newton method for \(\ell_0\)-regularized optimization
- Non-marginal feature screening for additive hazard model with ultrahigh-dimensional covariates
- Optimal classification trees
- Sparse optimization via vector \(k\)-norm and DC programming with an application to feature selection for support vector machines
- Subset selection in network-linked data
- Estimation of \(l_0\) norm penalized models: a statistical treatment
- Optimal $k$-Thresholding Algorithms for Sparse Optimization Problems
- Partially egalitarian portfolio selection
- Fast best subset selection: coordinate descent and local combinatorial optimization algorithms
- Numerical characterization of support recovery in sparse regression with correlated design
- On the convergence of inexact alternate minimization in problems with \(\ell_0\) penalties
- A Mixed-Integer Fractional Optimization Approach to Best Subset Selection
- Constructing two-level \(Q_B\)-optimal screening designs using mixed-integer programming and heuristic algorithms
- Simplex QP-based methods for minimizing a conic quadratic objective over polyhedra
- Sparse high-dimensional regression: exact scalable algorithms and phase transitions
- Discussion of "Best subset, forward stepwise or Lasso? Analysis and recommendations based on extensive comparisons"
- Certifiably optimal sparse principal component analysis
- Sparse regression: scalable algorithms and empirical performance
- The EAS approach to variable selection for multivariate response data in high-dimensional settings
- Joint outlier detection and variable selection using discrete optimization
- Divide-and-conquer information-based optimal subdata selection algorithm
- On sparse ensemble methods: an application to short-term predictions of the evolution of COVID-19
- Simultaneous feature selection and outlier detection with optimality guarantees
- Adaptive iterative hard thresholding for least absolute deviation problems with sparsity constraints
- Mathematical programming for simultaneous feature selection and outlier detection under l1 norm
- Modern variable selection in action: comment on the papers by HTT and BPV
- Sparse HP filter: finding kinks in the COVID-19 contact rate
- Rejoinder: "Sparse regression: scalable algorithms and empirical performance"
- Cost-sensitive feature selection for support vector machines
- On support vector machines under a multiple-cost scenario
- A new approach to select the best subset of predictors in linear regression modelling: bi-objective mixed integer linear programming
- Gaussian Process Assisted Active Learning of Physical Laws
- Budget constrained model selection for multiple linear regression
- On computing medians of marked point process data under edit distance
- Bayesian \(l_0\)-regularized least squares
- Cost-based feature selection for support vector machines: an application in credit scoring
- An automated exact solution framework towards solving the logistic regression best subset selection problem
- Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection
- The trimmed Lasso: sparse recovery guarantees and practical optimization by the generalized soft-min penalty
- Best-subset model selection based on multitudinal assessments of likelihood improvements
- A unified approach to mixed-integer optimization problems with logical constraints
- On integer and MPCC representability of affine sparsity
- Lagrangian duality and saddle points for sparse linear programming
- Cardinality minimization, constraints, and regularization: a survey
- Combinatorial optimization. Abstracts from the workshop held November 7--13, 2021 (hybrid meeting)
- Techniques for accelerating branch-and-bound algorithms dedicated to sparse optimization
- DC formulations and algorithms for sparse optimization problems
- Scalable algorithms for the sparse ridge regression
- Deconfounding and Causal Regularisation for Stability and External Validity
- Semi-automated simultaneous predictor selection for regression-SARIMA models
- Sparse classification: a scalable discrete optimization perspective
- On the Solution of ℓ0-Constrained Sparse Inverse Covariance Estimation Problems
- A reformulation technique to solve polynomial optimization problems with separable objective functions of bounded integer variables
- Logistic regression: from art to science
- Leveraged least trimmed absolute deviations
- Minimization of Akaike's information criterion in linear regression analysis via mixed integer nonlinear program
- Global optimization for sparse solution of least squares problems
- Optimization for L1-Norm Error Fitting via Data Aggregation
- SuRF: A new method for sparse variable selection, with application in microbiome data analysis
- A sketch-and-select Arnoldi process
- A two-stage approach to the UCITS-constrained index-tracking problem
- Mixed integer nonlinear program for minimization of Akaike's information criterion
- Grouped variable selection with discrete optimization: computational and statistical perspectives
- scientific article; zbMATH DE number 7387624
- Solving nonnegative sparsity-constrained optimization via DC quadratic-piecewise-linear approximations
- A graph-based decomposition method for convex quadratic optimization with indicators
- Feature subset selection for logistic regression via mixed integer optimization
- A look at robustness and stability of \(\ell_1\)-versus \(\ell_0\)-regularization: discussion of papers by Bertsimas et al. and Hastie et al.
- Piecewise Linear Function Fitting via Mixed-Integer Linear Programming
- Best subset binary prediction
- Sparse regression for data-driven deterrence functions in gravity models
- MIP-BOOST: Efficient and Effective L0 Feature Selection for Linear Regression
- Extending AIC to best subset regression
- scientific article; zbMATH DE number 7750672
- Sparse regression over clusters: SparClur
- Locating hyperplanes to fitting set of points: a general framework
- L0-Regularized Learning for High-Dimensional Additive Hazards Regression
- scientific article; zbMATH DE number 7370569
- Clustering, multicollinearity, and singular vectors
- Mathematical optimization for analyzing and forecasting nonlinear network time series
- Subspace Newton method for sparse group \(\ell_0\) optimization problem
- Exterior-point optimization for sparse and low-rank optimization
- Convex optimization for group feature selection in networked data
- Outlier detection in time series via mixed-integer conic quadratic optimization
- Tractable ADMM schemes for computing KKT points and local minimizers for \(\ell_0\)-minimization problems
- Convergent inexact penalty decomposition methods for cardinality-constrained problems
- scientific article; zbMATH DE number 7415078
- An Alternating Method for Cardinality-Constrained Optimization: A Computational Study for the Best Subset Selection and Sparse Portfolio Problems
- Certifiably optimal sparse inverse covariance estimation
- Sparse convex optimization toolkit: a mixed-integer framework
- A discussion on practical considerations with sparse regression methodologies
- Solution sets of three sparse optimization problems for multivariate regression
- General feasibility bounds for sample average approximation via Vapnik-Chervonenkis dimension
- The EAS approach for graphical selection consistency in vector autoregression models
This page was built for publication: Best subset selection via a modern optimization lens