A significance test for the lasso (Q2249837)
From MaRDI portal
scientific article
Language | Label | Description | Also known as
---|---|---|---
English | A significance test for the lasso | scientific article |
Statements
A significance test for the lasso (English)
3 July 2014
A linear regression model is considered, \[ y=X\beta^*+\varepsilon,\quad \varepsilon\sim N(0, \sigma^2I), \] where \(y\in \mathbb{R}^n\) is an outcome vector, \(X\) is an \(n\times p\) design matrix, and \(\beta^*\in \mathbb{R}^p\) is an unknown coefficient vector to be estimated. The lasso estimator \(\hat{\beta}=\hat{\beta}(\lambda)\) minimizes the objective function \[ Q(\beta; \lambda)=\frac{1}{2} \|y-X\beta\|_2^2+\lambda \|\beta\|_1,\quad \beta\in \mathbb{R}^p, \] where \(\lambda \geq 0\) is a tuning parameter controlling the level of sparsity in \(\hat{\beta}\). The columns of \(X\) are assumed to be in general position, which ensures uniqueness of the lasso solution; see [\textit{R. J. Tibshirani}, Electron. J. Stat. 7, 1456--1490 (2013; Zbl 1337.62173)].

The path \(\lambda\mapsto\hat{\beta}(\lambda)\) is piecewise linear, with knots at values \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_r \geq 0\). For \(\lambda\) above \(\lambda_1\) the solution has no active variables, and as \(\lambda\) decreases, each knot \(\lambda_k\) marks the entry or removal of some variable from the current active set. At any \(\lambda \geq 0\), the active set \(A=\operatorname{supp}(\hat{\beta}(\lambda))\) indexes a linearly independent set of predictor variables, that is, \(\operatorname{rank}(X_A)=|A|\), where \(X_A\) denotes the columns of \(X\) indexed by \(A\).

Let \(A\) be the active set just before \(\lambda_k\), and suppose that predictor \(j\) enters at \(\lambda_k\). Let \(\hat{\beta}(\lambda_{k+1})\) denote the lasso solution at \(\lambda=\lambda_{k+1}\) using the predictors in \(A\) together with \(j\), and let \(\tilde{\beta}_A(\lambda_{k+1})\) be the lasso solution at \(\lambda=\lambda_{k+1}\) using only the active predictors \(X_A\). In the paper under review, the \textit{covariance test statistic} is proposed, \[ T_k=\frac{1}{\sigma^2}\left\langle y,\; X\hat{\beta}(\lambda_{k+1})-X_A\tilde{\beta}_A(\lambda_{k+1})\right\rangle. \] The main result, given in Theorem 3, states the following: under the null hypothesis that the current lasso model contains all truly active variables, \(\operatorname{supp}(\beta^*) \subseteq A\), and under reasonable assumptions on \(X\) and the magnitudes of the nonzero true coefficients, \(T_k\) is asymptotically distributed as a standard exponential random variable. The statistic can thus be used to test the significance of an additional variable between two nested models even when that variable is not fixed in advance but has been chosen adaptively. In Section 6, the result is extended to the case of unknown \(\sigma^2\). Section 8 discusses extensions to the elastic net, generalized linear models, and the Cox proportional hazards model; the proposals there are supported by simulations, but no theory is offered.
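For the first step of the path the paper gives a simple closed form, \(T_1=\lambda_1(\lambda_1-\lambda_2)/\sigma^2\); under the global null \(\beta^*=0\) with an orthonormal design, the knots \(\lambda_1\geq\lambda_2\) are just the two largest entries of \(|X^\top y|\). The following minimal simulation sketch (assuming exactly this orthonormal, \(\sigma^2=1\), global-null setting; sample sizes and the number of repetitions are illustrative choices, not from the paper) illustrates the Exp(1) limit numerically:

```python
import numpy as np

def first_covariance_statistic(z):
    """First covariance test statistic T_1 = lam1*(lam1 - lam2)/sigma^2
    for an orthonormal design with sigma^2 = 1, where the first two lasso
    knots lam1 >= lam2 are the two largest entries of |X^T y| = |z|."""
    a = np.sort(np.abs(z))       # ascending order statistics of |z|
    lam1, lam2 = a[-1], a[-2]    # top two knots of the lasso path
    return lam1 * (lam1 - lam2)

rng = np.random.default_rng(0)
p, nsim = 1000, 5000
# Under the global null with orthonormal X, X^T y ~ N(0, I_p).
T = np.array([first_covariance_statistic(rng.standard_normal(p))
              for _ in range(nsim)])

# The Exp(1) limit predicts mean close to 1 and median close to log 2.
print(f"mean(T)   = {T.mean():.3f}")
print(f"median(T) = {np.median(T):.3f}")
```

The sampled mean and median should land near the Exp(1) values 1 and \(\log 2\approx 0.693\); the weak, heavy-right-tailed Exp(1) shape of \(T_1\), rather than the \(\chi^2_1\) one would use for a fixed nested comparison, is exactly the price of the adaptive variable selection discussed above.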
lasso
least angle regression
\(p\)-value
significance test