Best subset selection via a modern optimization lens
From MaRDI portal
(Redirected from Publication:282479)
Abstract: In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving Mixed Integer Optimization (MIO) problems. We present a MIO approach for solving the classical best subset selection problem of choosing out of features in linear regression given observations. We develop a discrete extension of modern first order continuous optimization methods to find high quality feasible solutions that we use as warm starts to a MIO solver that finds provably optimal solutions. The resulting algorithm (a) provides a solution with a guarantee on its suboptimality even if we terminate the algorithm early, (b) can accommodate side constraints on the coefficients of the linear regression and (c) extends to finding best subset solutions for the least absolute deviation loss function. Using a wide variety of synthetic and real datasets, we demonstrate that our approach solves problems with in the 1000s and in the 100s in minutes to provable optimality, and finds near optimal solutions for in the 100s and in the 1000s in minutes. We also establish via numerical experiments that the MIO approach performs better than { exttt {Lasso}} and other popularly used sparse learning procedures, in terms of achieving sparse solutions with good predictive power.
Recommendations
- OR forum: An algorithmic approach to linear regression
- An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series
- Fast best subset selection: coordinate descent and local combinatorial optimization algorithms
- A new approach to select the best subset of predictors in linear regression modelling: bi-objective mixed integer linear programming
- An effective procedure for feature subset selection in logistic regression based on information criteria
Cites work
- scientific article; zbMATH DE number 5957408 (Why is no real title available?)
- scientific article; zbMATH DE number 1906319 (Why is no real title available?)
- scientific article; zbMATH DE number 845714 (Why is no real title available?)
- A Statistical View of Some Chemometrics Regression Tools
- A brief history of linear and mixed-integer programming computation
- A general theory of concave regularization for high-dimensional sparse estimation problems
- A unified approach to model selection and sparse recovery using regularized least squares
- Aggregation for Gaussian regression
- Algorithm for cardinality-constrained quadratic optimization
- Analysis of multi-stage convex relaxation for sparse regularization
- Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space
- Asymptotics for Lasso-type estimators.
- Atomic Decomposition by Basis Pursuit
- Best subset selection via a modern optimization lens
- Best subset selection, persistence in high-dimensional statistical learning and optimization under l₁ constraint
- Certifying the Restricted Isometry Property is Hard
- Computational study of a family of mixed-integer quadratic programming problems
- Enhancing sparsity by reweighted \(\ell _{1}\) minimization
- For most large underdetermined systems of linear equations the minimal 𝓁1‐norm solution is also the sparsest solution
- Gradient methods for minimizing composite functions
- Graph puzzles, homotopy, and the alternating group
- High-dimensional graphs and variable selection with the Lasso
- Ideal spatial adaptation by wavelet shrinkage
- Introductory lectures on convex optimization. A basic course.
- Iterative hard thresholding for compressed sensing
- Iterative thresholding for sparse approximations
- Just relax: convex programming methods for identifying sparse signals in noise
- Least angle regression. (With discussion)
- Least quantile regression via modern optimization
- Minimax Rates of Estimation for High-Dimensional Linear Regression Over $\ell_q$-Balls
- Near-Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?
- Near-ideal model selection by \(\ell _{1}\) minimization
- Nearly unbiased variable selection under minimax concave penalty
- Nonconcave Penalized Likelihood With NP-Dimensionality
- On constrained and regularized high-dimensional regression
- One-step sparse estimates in nonconcave penalized likelihood models
- Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ 1 minimization
- Pathwise coordinate optimization
- Persistene in high-dimensional linear predictor-selection and the virtue of overparametrization
- Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using $\ell _{1}$-Constrained Quadratic Programming (Lasso)
- Simultaneous analysis of Lasso and Dantzig selector
- Smooth minimization of non-smooth functions
- Sparse Approximate Solutions to Linear Systems
- SparseNet: coordinate descent with nonconvex penalties
- Statistics for high-dimensional data. Methods, theory and applications.
- The Adaptive Lasso and Its Oracle Properties
- The adaptive and the thresholded Lasso for potentially misspecified models (and a lower bound for the Lasso)
- The restricted isometry property and its implications for compressed sensing
- The sparsity and bias of the LASSO selection in high-dimensional linear regression
- Uncertainty principles and ideal atomic decomposition
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Cited in
(only showing first 100 items - show all)- Linear Time Dynamic Programming for Computing Breakpoints in the Regularization Path of Models Selected From a Finite Set
- scientific article; zbMATH DE number 5054864 (Why is no real title available?)
- A polynomial algorithm for best-subset selection problem
- A duality-based method for identifying elemental balance violations in metabolic network models
- Improved RIP-based bounds for guaranteed performance of two compressed sensing algorithms
- Optimization problems for machine learning: a survey
- A unified primal dual active set algorithm for nonconvex sparse recovery
- Smoothed quantile regression with large-scale inference
- An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series
- scientific article; zbMATH DE number 7306911 (Why is no real title available?)
- Best subset selection with shrinkage: sparse additive hazards regression with the grouping effect
- Scaled proximal gradient methods for sparse optimization problems
- The min-Knapsack problem with compactness constraints and applications in statistics
- Predicting the equity market risk premium: a model selection approach
- A First-Order Optimization Algorithm for Statistical Learning with Hierarchical Sparsity Structure
- Structural properties of affine sparsity constraints
- Blessing of massive scale: spatial graphical model estimation with a total cardinality constraint approach
- Time series modeling and forecasting by mathematical programming
- High-dimensional variable selection via low-dimensional adaptive learning
- An effective procedure for feature subset selection in logistic regression based on information criteria
- Sparse Convex Regression
- Sparse quantile regression
- An \(\ell_{2,0}\)-norm constrained matrix optimization via extended discrete first-order algorithms
- Inference in a Class of Optimization Problems: Confidence Regions and Finite Sample Bounds on Errors in Coverage Probabilities
- Supervised homogeneity fusion: a combinatorial approach
- Variable selection in linear regression models: choosing the best subset is not always the best choice
- Relative sparsity for medical decision problems
- The backbone method for ultra-high dimensional sparse machine learning
- A first derivative Potts model for segmentation and denoising using ILP
- Efficient subset selection for the expected opportunity cost
- Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor
- Graph structured sparse subset selection
- Best subset, forward stepwise or Lasso? Analysis and recommendations based on extensive comparisons
- On the sensitivity of the Lasso to the number of predictor variables
- Sparsifying the least-squares approach to PCA: comparison of lasso and cardinality constraint
- Mixed-integer quadratic programming reformulations of multi-task learning models
- Subset Selection and the Cone of Factor-Width-k Matrices
- L 0 -regularization for high-dimensional regression with corrupted data
- A New Sparse-Learning Model for Maximum Gap Reduction of Composite Fuselage Assembly
- Constrained optimization of rank-one functions with indicator variables
- Investigating consumers' store-choice behavior via hierarchical variable selection
- Detecting racial bias in jury selection
- A solution approach for cardinality minimization problem based on fractional programming
- OR forum: An algorithmic approach to linear regression
- Bayesian selection of best subsets via hybrid search
- Resolvability of Hamming graphs
- Subset selection for multiple linear regression via optimization
- Linear biomarker combination for constrained classification
- Sparse index tracking using sequential Monte Carlo
- Best subset selection via a modern optimization lens
- A new perspective on low-rank optimization
- A mixed-integer exponential cone programming formulation for feature subset selection in logistic regression
- Variable selection in additive models via hierarchical sparse penalty
- Post-estimation shrinkage in full and selected linear regression models in low-dimensional data revisited
- On the convexification of constrained quadratic optimization problems with indicator variables
- Sparse hierarchical regression with polynomials
- On cutting planes for cardinality-constrained linear programs
- Ideal formulations for constrained convex optimization problems with indicator variables
- Optimal Subsampling via Predictive Inference
- Environment invariant linear least squares
- Bootstrap estimation of the proportion of outliers in robust regression
- Stab-GKnock: controlled variable selection for partially linear models using generalized knockoffs
- Cardinality-Constrained Multi-objective Optimization: Novel Optimality Conditions and Algorithms
- COMBSS: best subset selection via continuous optimization
- Learning optimized risk scores
- scientific article; zbMATH DE number 7415090 (Why is no real title available?)
- Strong formulations for quadratic optimization with M-matrices and indicator variables
- Smoothing Newton method for \(\ell^0\)-\(\ell^2\) regularized linear inverse problem
- Matrix completion with nonconvex regularization: spectral operators and scalable algorithms
- A guide for sparse PCA: model comparison and applications
- Robust subset selection
- A Computational Framework for Solving Nonlinear Binary Optimization Problems in Robust Causal Inference
- Stability Representations of Many-to-One Matching Problems: An Integer Optimization Approach
- Fast and approximate exhaustive variable selection for generalised linear models with APES
- A unified framework for bivariate clustering and regression problems via mixed-integer linear programming
- Communication-efficient estimation for distributed subset selection
- Best subset selection via cross-validation criterion
- A new model for counterfactual analysis for functional data
- Distributed primal outer approximation algorithm for sparse convex programming with separable structures
- Learning sparse nonlinear dynamics via mixed-integer optimization
- Polyhedral results and stronger Lagrangean bounds for stable spanning trees
- New bounds for subset selection from conic relaxations
- Mining events with declassified diplomatic documents
- An offline/online procedure for dual norm calculations of parameterized functionals: empirical quadrature and empirical test spaces
- Integer constraints for enhancing interpretability in linear regression
- Compositional knockoff filter for high‐dimensional regression analysis of microbiome data
- Regularized scalar-on-function regression analysis to assess functional association of critical physical activity window with biological age
- Group Sparse Optimization for Images Recovery Using Capped Folded Concave Functions
- scientific article; zbMATH DE number 6982301 (Why is no real title available?)
- A Scalable Algorithm for Sparse Portfolio Selection
- A greedy feature selection algorithm for big data of high dimensionality
- Variable selection in convex quantile regression: \(\mathcal{L}_1\)-norm or \(\mathcal{L}_0\)-norm regularization?
- Convergence properties of stochastic proximal subgradient method in solving a class of composite optimization problems with cardinality regularizer
- A robust knockoff filter for sparse regression analysis of microbiome compositional data
- A constrained minimum method for model selection
- Linear screening for high-dimensional computer experiments
- Provably optimal sparse solutions to overdetermined linear systems with non-negativity constraints in a least-squares sense by implicit enumeration
- Linear regression with sparsely permuted data
- Sparse regression at scale: branch-and-bound rooted in first-order optimization
- Subset Selection for Linear Mixed Models
This page was built for publication: Best subset selection via a modern optimization lens
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q282479)