A phase transition for finding needles in nonlinear haystacks with LASSO artificial neural networks
Publication:2103975
Abstract: To fit sparse linear associations, the LASSO sparsity-inducing penalty, driven by a single hyperparameter, provably recovers the important features (needles) with high probability in certain regimes, even when the sample size is smaller than the dimension of the input vector (haystack). More recently, learners known as artificial neural networks (ANNs) have shown great success in many machine learning tasks, in particular in fitting nonlinear associations. A small learning rate, the stochastic gradient descent algorithm, and a large training set help cope with the explosion in the number of parameters of deep neural networks. Yet few ANN learners have been developed and studied for finding needles in nonlinear haystacks. Driven by a single hyperparameter, our ANN learner, like the LASSO for sparse linear associations, exhibits a phase transition in the probability of retrieving the needles, which we do not observe with other ANN learners. To select the penalty parameter, we generalize the universal threshold of Donoho and Johnstone (1994), a rule preferable to cross-validation, which is both conservative (too many false detections) and expensive. In the spirit of simulated annealing, we propose a warm-start sparsity-inducing algorithm to solve the resulting high-dimensional, non-convex, and non-differentiable optimization problem. Precise Monte Carlo simulations demonstrate the effectiveness of our approach.
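To make the approach described in the abstract concrete, here is a minimal sketch, not the authors' algorithm: an \(\ell_1\)-penalized one-hidden-layer ReLU network trained by proximal (soft-thresholding) gradient steps along a decreasing penalty path, mimicking the warm-start idea. The data generator, architecture, learning rate, and lambda grid are all assumptions made for this example; the paper instead calibrates the penalty with a generalized universal threshold, extending Donoho and Johnstone's \(\lambda = \sigma\sqrt{2\log n}\) for the Gaussian sequence model.

```python
# Minimal illustrative sketch (not the paper's exact algorithm): an
# l1-penalized one-hidden-layer ReLU network trained with proximal
# (soft-thresholding) gradient steps along a decreasing penalty path.
# The data generator, architecture, learning rate, and lambda grid are
# assumptions made for this example only.
import torch

torch.manual_seed(0)

n, p, s, h = 200, 50, 3, 16          # samples, inputs, true needles, hidden units
X = torch.randn(n, p)
y = torch.sin(X[:, :s]).sum(dim=1, keepdim=True) + 0.1 * torch.randn(n, 1)

W1 = (0.1 * torch.randn(h, p)).requires_grad_()  # only W1 is l1-penalized
b1 = (0.1 * torch.ones(h)).requires_grad_()      # positive biases keep units alive
W2 = (0.1 * torch.randn(1, h)).requires_grad_()
b2 = torch.zeros(1, requires_grad=True)
params = [W1, b1, W2, b2]

def forward(X):
    return torch.relu(X @ W1.T + b1) @ W2.T + b2

def soft_threshold(w, t):
    # prox operator of t * ||w||_1: small entries become exactly zero
    return torch.sign(w) * torch.clamp(w.abs() - t, min=0.0)

lr = 1e-2
# Warm start: solve at a large penalty first, then reuse the fit as the
# starting point for the next, smaller penalty (annealing-like schedule).
for lam in torch.logspace(-0.5, -2.5, steps=8).tolist():
    for _ in range(400):
        loss = ((forward(X) - y) ** 2).mean()
        for q in params:
            q.grad = None
        loss.backward()
        with torch.no_grad():
            for q in params:
                q -= lr * q.grad
            W1.copy_(soft_threshold(W1, lr * lam))  # proximal step on W1 only
    support = (W1.abs().sum(dim=0) > 0).nonzero().flatten().tolist()
    print(f"lambda={lam:.4f}  selected inputs: {support}")
```

Printing the selected inputs along the path illustrates, at toy scale, the behavior the paper studies: with too large a penalty every input is screened out, with too small a penalty false detections creep in, and the generalized universal threshold is the paper's rule for choosing the operating point between the two regimes.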
Recommendations
- False discoveries occur early on the Lasso path
- Taming Neural Networks with TUSLA: Nonconvex Learning via Adaptive Stochastic Gradient Langevin Algorithms
- Asymptotic properties of one-layer artificial neural networks with sparse connectivity
- Consistent Sparse Deep Learning: Theory and Computation
- Testing for neglected nonlinearity using artificial neural networks with many randomized hidden unit activations
Cites work
- scientific article; zbMATH DE number 6378127
- scientific article; zbMATH DE number 3860199
- scientific article; zbMATH DE number 739533
- scientific article; zbMATH DE number 845714
- A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
- A survey of cross-validation procedures for model selection
- Approximation by superpositions of a sigmoidal function
- Atomic Decomposition by Basis Pursuit
- Compressed sensing
- Consistent Sparse Deep Learning: Theory and Computation
- Covariance stabilizing transformations
- Decoding by Linear Programming
- Deep Neural Network Approximation Theory
- High-dimensional dynamics of generalization error in neural networks
- Ideal spatial adaptation by wavelet shrinkage
- Learning representations by back-propagating errors
- Make \(\ell_1\) regularization effective in training sparse CNN
- Model Selection With Lasso-Zero: Adding Straw to the Haystack to Better Find Needles
- Model Selection and Estimation in Regression with Grouped Variables
- Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences
- Optimal approximation with sparsely connected deep neural networks
- Optimization with sparsity-inducing penalties
- Quantile universal threshold
- Random forests
- Ridge Regression: Biased Estimation for Nonorthogonal Problems
- Scaling description of generalization with number of parameters in deep learning
- Square-root lasso: pivotal recovery of sparse signals via conic programming
- Statistics for high-dimensional data. Methods, theory and applications.
- Surprises in high-dimensional ridgeless least squares interpolation
- The Generalization Error of Random Features Regression: Precise Asymptotics and the Double Descent Curve
- The Noise-Sensitivity Phase Transition in Compressed Sensing
- The gap between theory and practice in function approximation with deep neural networks
- Transformed \(\ell_1\) regularization for learning sparse deep neural networks
- Universal approximation bounds for superpositions of a sigmoidal function