Efficient regularized regression with \(L_0\) penalty for variable selection and network construction (Q2011726)

From MaRDI portal
Property / full work available at URL: https://doi.org/10.1155/2016/3456153 (normal rank)
Property / OpenAlex ID: W2532146222 (normal rank)

Revision as of 19:10, 19 March 2024

Language: English
Label: Efficient regularized regression with \(L_0\) penalty for variable selection and network construction
Description: scientific article

    Statements

    Efficient regularized regression with \(L_0\) penalty for variable selection and network construction (English)
    4 August 2017
    Summary: Variable selection for regression with high-dimensional big data has found many applications in bioinformatics and computational biology. One appealing approach is \(L_0\) regularized regression, which directly penalizes the number of nonzero features in the model. However, \(L_0\) optimization is well known to be NP-hard and computationally challenging. In this paper, we propose efficient EM (\(L_0\)EM) and dual \(L_0\)EM (D\(L_0\)EM) algorithms that directly approximate the \(L_0\) optimization problem. While \(L_0\)EM is efficient with large sample sizes, D\(L_0\)EM is efficient with high-dimensional (\(n \ll m\)) data. They also provide a natural solution to all \(L_p\) problems with \(p \in [0,2]\), including lasso with \(p = 1\) and elastic net with \(p \in [1,2]\). The regularization parameter \(\lambda\) can be determined through cross validation or through AIC and BIC. We demonstrate our methods through simulation and on high-dimensional genomic data. The results indicate that \(L_0\) performs better than lasso, SCAD, and MC+, and that \(L_0\) with AIC or BIC performs similarly to computationally intensive cross validation. The proposed algorithms are efficient in identifying the nonzero variables with less bias and in constructing biologically important networks from high-dimensional big data.
    \(L_0\) regularized regression
    high-dimensional big data
    EM algorithms
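
The summary above describes approximating the \(L_0\) problem with a primal algorithm suited to large samples and a dual one suited to \(n \ll m\), but this record does not give the update rules. The following is only a minimal sketch of one way such a scheme can look, assuming an iteratively reweighted ridge update in which the weights are the squared current coefficients; the function names, the ridge initialisation, the tolerance, and the update formulas are all assumptions made for illustration and are not taken from this page.

```python
import numpy as np


def ridge_init(X, y, lam):
    """Ridge estimate used as a starting point (an assumed initialisation)."""
    m = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)


def l0_reweighted_ridge(X, y, lam, n_iter=100, tol=1e-10, dual=False):
    """Approximate min ||y - X b||^2 + lam * (number of nonzero b_j).

    Hypothetical sketch: each step solves a ridge problem whose penalty
    on b_j is lam * b_j^2 / eta_j with eta_j = (previous b_j)^2, so at a
    fixed point the penalty counts the nonzero coefficients.
    """
    n, m = X.shape
    beta = ridge_init(X, y, lam)
    for _ in range(n_iter):
        eta = beta ** 2              # weights from the current estimate
        X_eta = X * eta              # scales column j of X by eta_j
        if dual:
            # n x n linear system: cheaper when n << m
            z = np.linalg.solve(X @ X_eta.T + lam * np.eye(n), y)
            new_beta = X_eta.T @ z
        else:
            # m x m linear system: cheaper when m << n
            new_beta = np.linalg.solve(
                X_eta.T @ X + lam * np.eye(m), X_eta.T @ y
            )
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


if __name__ == "__main__":
    # Synthetic data: only the first two of ten variables are active.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 10))
    beta_true = np.zeros(10)
    beta_true[:2] = [3.0, -2.0]
    y = X @ beta_true + 0.1 * rng.standard_normal(100)

    beta_hat = l0_reweighted_ridge(X, y, lam=1.0)
    print("selected variables:", np.flatnonzero(np.abs(beta_hat) > 1e-6))
```

Under these assumptions the primal and dual branches compute the same update (they are linked by the matrix push-through identity) and differ only in the size of the linear system solved, which is why the dual form is the natural choice when \(n \ll m\). Multiplying by the weights instead of dividing by them keeps the update well defined once a coefficient has shrunk to zero.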
