Best subset selection via a modern optimization lens
From MaRDI portal
Abstract: In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving Mixed Integer Optimization (MIO) problems. We present a MIO approach for solving the classical best subset selection problem of choosing \(k\) out of \(p\) features in linear regression given \(n\) observations. We develop a discrete extension of modern first-order continuous optimization methods to find high-quality feasible solutions that we use as warm starts to a MIO solver that finds provably optimal solutions. The resulting algorithm (a) provides a solution with a guarantee on its suboptimality even if we terminate the algorithm early, (b) can accommodate side constraints on the coefficients of the linear regression, and (c) extends to finding best subset solutions for the least absolute deviation loss function. Using a wide variety of synthetic and real datasets, we demonstrate that our approach solves problems with \(n\) in the 1000s and \(p\) in the 100s in minutes to provable optimality, and finds near-optimal solutions for \(n\) in the 100s and \(p\) in the 1000s in minutes. We also establish via numerical experiments that the MIO approach performs better than \texttt{Lasso} and other popularly used sparse learning procedures in terms of achieving sparse solutions with good predictive power.
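The discrete first-order scheme mentioned in the abstract can be sketched as a projected-gradient (iterative hard thresholding) loop: take a gradient step on the least-squares loss, then project onto the set of \(k\)-sparse vectors by keeping the \(k\) largest-magnitude coefficients. The function names, step size, and stopping rule below are illustrative assumptions, not taken from the paper; the paper uses such iterates only as warm starts for an exact MIO solver.

```python
import numpy as np

def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta, zero out the rest."""
    out = np.zeros_like(beta)
    idx = np.argsort(np.abs(beta))[-k:]
    out[idx] = beta[idx]
    return out

def discrete_first_order(X, y, k, n_iter=500, tol=1e-8):
    """Heuristic for min ||y - X beta||^2 subject to ||beta||_0 <= k.

    Gradient step on the least-squares loss with step size 1/L, where L
    bounds the Lipschitz constant of the gradient, followed by projection
    onto the k-sparse set via hard thresholding.
    """
    n, p = X.shape
    L = np.linalg.eigvalsh(X.T @ X).max()  # largest eigenvalue of X'X
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)
        beta_new = hard_threshold(beta - grad / L, k)
        if np.linalg.norm(beta_new - beta) < tol:
            return beta_new
        beta = beta_new
    return beta
```

A feasible point produced this way can be passed to a MIO solver as an incumbent, so the solver's branch-and-bound search starts from a good upper bound and can certify (sub)optimality even if stopped early.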
Recommendations
- OR forum: An algorithmic approach to linear regression
- An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series
- Fast best subset selection: coordinate descent and local combinatorial optimization algorithms
- A new approach to select the best subset of predictors in linear regression modelling: bi-objective mixed integer linear programming
- An effective procedure for feature subset selection in logistic regression based on information criteria
Cites work
- scientific article; zbMATH DE number 5957408
- scientific article; zbMATH DE number 1906319
- scientific article; zbMATH DE number 845714
- A Statistical View of Some Chemometrics Regression Tools
- A brief history of linear and mixed-integer programming computation
- A general theory of concave regularization for high-dimensional sparse estimation problems
- A unified approach to model selection and sparse recovery using regularized least squares
- Aggregation for Gaussian regression
- Algorithm for cardinality-constrained quadratic optimization
- Analysis of multi-stage convex relaxation for sparse regularization
- Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space
- Asymptotics for Lasso-type estimators.
- Atomic Decomposition by Basis Pursuit
- Best subset selection via a modern optimization lens
- Best subset selection, persistence in high-dimensional statistical learning and optimization under \(l_1\) constraint
- Certifying the Restricted Isometry Property is Hard
- Computational study of a family of mixed-integer quadratic programming problems
- Enhancing sparsity by reweighted \(\ell _{1}\) minimization
- For most large underdetermined systems of linear equations the minimal 𝓁1‐norm solution is also the sparsest solution
- Gradient methods for minimizing composite functions
- Graph puzzles, homotopy, and the alternating group
- High-dimensional graphs and variable selection with the Lasso
- Ideal spatial adaptation by wavelet shrinkage
- Introductory lectures on convex optimization. A basic course.
- Iterative hard thresholding for compressed sensing
- Iterative thresholding for sparse approximations
- Just relax: convex programming methods for identifying sparse signals in noise
- Least angle regression. (With discussion)
- Least quantile regression via modern optimization
- Minimax Rates of Estimation for High-Dimensional Linear Regression Over $\ell_q$-Balls
- Near-Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?
- Near-ideal model selection by \(\ell _{1}\) minimization
- Nearly unbiased variable selection under minimax concave penalty
- Nonconcave Penalized Likelihood With NP-Dimensionality
- On constrained and regularized high-dimensional regression
- One-step sparse estimates in nonconcave penalized likelihood models
- Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ 1 minimization
- Pathwise coordinate optimization
- Persistence in high-dimensional linear predictor-selection and the virtue of overparametrization
- Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using $\ell _{1}$-Constrained Quadratic Programming (Lasso)
- Simultaneous analysis of Lasso and Dantzig selector
- Smooth minimization of non-smooth functions
- Sparse Approximate Solutions to Linear Systems
- SparseNet: coordinate descent with nonconvex penalties
- Statistics for high-dimensional data. Methods, theory and applications.
- The Adaptive Lasso and Its Oracle Properties
- The adaptive and the thresholded Lasso for potentially misspecified models (and a lower bound for the Lasso)
- The restricted isometry property and its implications for compressed sensing
- The sparsity and bias of the LASSO selection in high-dimensional linear regression
- Uncertainty principles and ideal atomic decomposition
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Cited in
- A two-stage approach to the UCITS-constrained index-tracking problem
- Combinatorial optimization. Abstracts from the workshop held November 7--13, 2021 (hybrid meeting)
- Leveraged least trimmed absolute deviations
- Locating hyperplanes to fitting set of points: a general framework
- Smoothed quantile regression with large-scale inference
- Integer constraints for enhancing interpretability in linear regression
- Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor
- On the convexification of constrained quadratic optimization problems with indicator variables
- On cutting planes for cardinality-constrained linear programs
- Sparse regression: scalable algorithms and empirical performance
- Subset selection for multiple linear regression via optimization
- A First-Order Optimization Algorithm for Statistical Learning with Hierarchical Sparsity Structure
- Resolvability of Hamming graphs
- Sparse hierarchical regression with polynomials
- Factor-driven two-regime regression
- Predicting the equity market risk premium: a model selection approach
- Tractable ADMM schemes for computing KKT points and local minimizers for \(\ell_0\)-minimization problems
- An attention algorithm for solving large scale structured \(l_0\)-norm penalty estimation problems
- Fitting sparse linear models under the sufficient and necessary condition for model identification
- Convergent inexact penalty decomposition methods for cardinality-constrained problems
- Optimal $k$-Thresholding Algorithms for Sparse Optimization Problems
- Fast best subset selection: coordinate descent and local combinatorial optimization algorithms
- Best subset selection via a modern optimization lens
- Ideal formulations for constrained convex optimization problems with indicator variables
- DC formulations and algorithms for sparse optimization problems
- Best subset binary prediction
- Certifiably optimal sparse inverse covariance estimation
- L0-Regularized Learning for High-Dimensional Additive Hazards Regression
- Cost-based feature selection for support vector machines: an application in credit scoring
- Fast and approximate exhaustive variable selection for generalised linear models with APES
- A greedy feature selection algorithm for big data of high dimensionality
- Efficient subset selection for the expected opportunity cost
- scientific article; zbMATH DE number 7370569
- Provably optimal sparse solutions to overdetermined linear systems with non-negativity constraints in a least-squares sense by implicit enumeration
- An offline/online procedure for dual norm calculations of parameterized functionals: empirical quadrature and empirical test spaces
- An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series
- Feature subset selection for logistic regression via mixed integer optimization
- Matrix completion with nonconvex regularization: spectral operators and scalable algorithms
- A Scalable Algorithm for Sparse Portfolio Selection
- Non-marginal feature screening for additive hazard model with ultrahigh-dimensional covariates
- Investigating consumers' store-choice behavior via hierarchical variable selection
- scientific article; zbMATH DE number 3940420
- Linear regression with sparsely permuted data
- Strong formulations for quadratic optimization with M-matrices and indicator variables
- scientific article; zbMATH DE number 5054864
- Optimization problems for machine learning: a survey
- Sparse Convex Regression
- Cost-sensitive feature selection for support vector machines
- Sparse regression at scale: branch-and-bound rooted in first-order optimization
- Optimal classification trees
- Structural properties of affine sparsity constraints
- Semi-automated simultaneous predictor selection for regression-SARIMA models
- On the Solution of ℓ0-Constrained Sparse Inverse Covariance Estimation Problems
- On the sensitivity of the Lasso to the number of predictor variables
- A discussion on practical considerations with sparse regression methodologies
- Blessing of massive scale: spatial graphical model estimation with a total cardinality constraint approach
- Simplex QP-based methods for minimizing a conic quadratic objective over polyhedra
- scientific article; zbMATH DE number 6982301
- Sparse high-dimensional regression: exact scalable algorithms and phase transitions
- Lagrangian duality and saddle points for sparse linear programming
- Extending AIC to best subset regression
- Logistic regression: from art to science
- A Bayesian perspective of statistical machine learning for big data
- Sparse regression over clusters: SparClur
- Sparse optimization via vector \(k\)-norm and DC programming with an application to feature selection for support vector machines
- Linear biomarker combination for constrained classification
- A unified approach to mixed-integer optimization problems with logical constraints
- Sparse classification: a scalable discrete optimization perspective
- An Alternating Method for Cardinality-Constrained Optimization: A Computational Study for the Best Subset Selection and Sparse Portfolio Problems
- OR forum: An algorithmic approach to linear regression
- Certifiably optimal sparse principal component analysis
- Scalable algorithms for the sparse ridge regression
- Piecewise Linear Function Fitting via Mixed-Integer Linear Programming
- Solving nonnegative sparsity-constrained optimization via DC quadratic-piecewise-linear approximations
- A new perspective on low-rank optimization
- Constructing two-level \(Q_B\)-optimal screening designs using mixed-integer programming and heuristic algorithms
- Optimization for L1-Norm Error Fitting via Data Aggregation
- A first derivative Potts model for segmentation and denoising using ILP
- Divide-and-conquer information-based optimal subdata selection algorithm
- \(L_0\)-regularization for high-dimensional regression with corrupted data
- Mathematical programming for simultaneous feature selection and outlier detection under l1 norm
- scientific article; zbMATH DE number 7415090
- scientific article; zbMATH DE number 7415078
- Sparse regression for data-driven deterrence functions in gravity models
- Improved RIP-based bounds for guaranteed performance of two compressed sensing algorithms
- A polynomial algorithm for best-subset selection problem
- A New Sparse-Learning Model for Maximum Gap Reduction of Composite Fuselage Assembly
- A sketch-and-select Arnoldi process
- A reformulation technique to solve polynomial optimization problems with separable objective functions of bounded integer variables
- Constrained optimization of rank-one functions with indicator variables
- Deconfounding and Causal Regularisation for Stability and External Validity
- Discussion of ``Best subset, forward stepwise or Lasso? Analysis and recommendations based on extensive comparisons
- High-Dimensional Cost-constrained Regression Via Nonconvex Optimization
- An \(\ell_{2,0}\)-norm constrained matrix optimization via extended discrete first-order algorithms
- Sparse quantile regression
- A new model for counterfactual analysis for functional data
- Grouped variable selection with discrete optimization: computational and statistical perspectives
- Inference in a Class of Optimization Problems: Confidence Regions and Finite Sample Bounds on Errors in Coverage Probabilities
- Regularized scalar-on-function regression analysis to assess functional association of critical physical activity window with biological age
- On sparse ensemble methods: an application to short-term predictions of the evolution of COVID-19
This page was built for publication: Best subset selection via a modern optimization lens (MaRDI item Q282479)