Outlier detection using nonconvex penalized regression

From MaRDI portal
Publication:3095181

DOI: 10.1198/JASA.2011.TM10390 | zbMATH Open: 1232.62068 | arXiv: 1006.2592 | OpenAlex: W1969515697 | MaRDI QID: Q3095181 | FDO: Q3095181


Authors: Yiyuan She, Art B. Owen


Publication date: 28 October 2011

Published in: Journal of the American Statistical Association

Abstract: This paper studies the outlier detection problem from the point of view of penalized regressions. Our regression model adds one mean shift parameter for each of the n data points. We then apply a regularization favoring a sparse vector of mean shift parameters. The usual L1 penalty yields a convex criterion, but we find that it fails to deliver a robust estimator. The L1 penalty corresponds to soft thresholding. We introduce a thresholding (denoted by Theta) based iterative procedure for outlier detection (Theta-IPOD). A version based on hard thresholding correctly identifies outliers on some hard test problems. We find that Theta-IPOD is much faster than iteratively reweighted least squares for large data because each iteration costs at most O(np) (and sometimes much less) avoiding an O(np^2) least squares estimate. We describe the connection between Theta-IPOD and M-estimators. Our proposed method has one tuning parameter with which to both identify outliers and estimate regression coefficients. A data-dependent choice can be made based on BIC. The tuned Theta-IPOD shows outstanding performance in identifying outliers in various situations in comparison to other existing approaches. This methodology extends to high-dimensional modeling with p ≫ n, if both the coefficient vector and the outlier pattern are sparse.
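
Illustrative sketch (not the authors' code): the mean shift model in the abstract can be read as y = X beta + gamma + noise, with a sparse gamma marking outliers. One plausible way to realize a hard-thresholding iteration is to alternate a least squares fit on y - gamma with thresholding of the residuals. The function names `hard_threshold` and `ipod_sketch`, the zero starting value for gamma, the hand-picked threshold `lam` (the abstract suggests a BIC-based choice instead), and the stopping rule are all assumptions made for this example. The QR factorization is computed once, so each pass of the loop costs O(np).

```python
import numpy as np


def hard_threshold(r, lam):
    """Hard thresholding: keep entries with |r_i| > lam, set the rest to zero."""
    return np.where(np.abs(r) > lam, r, 0.0)


def ipod_sketch(X, y, lam, max_iter=500, tol=1e-8):
    """Alternate least squares on (y - gamma) with thresholded residuals.

    Hypothetical helper for illustration; gamma starts at zero, which is an
    assumption and may not be a robust enough start on hard problems.
    """
    n, _ = X.shape
    Q, _ = np.linalg.qr(X)  # reduced QR, computed once up front: O(n p^2)
    gamma = np.zeros(n)
    for _ in range(max_iter):
        fitted = Q @ (Q.T @ (y - gamma))  # projection of y - gamma onto col(X): O(np)
        gamma_new = hard_threshold(y - fitted, lam)
        converged = np.max(np.abs(gamma_new - gamma)) < tol
        gamma = gamma_new
        if converged:
            break
    # Final coefficient estimate from the outlier-adjusted response.
    beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
    return beta, gamma
```

Nonzero entries of the returned `gamma` flag the observations treated as outliers, and `beta` is the least squares fit after subtracting those shifts from `y`.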


Full work available at URL: https://arxiv.org/abs/1006.2592









Cited In (99)





