Log-ratio Lasso: scalable, sparse estimation for log-ratio models

DOI: 10.1111/BIOM.12995; zbMATH Open: 1436.62510; arXiv: 1709.01139; OpenAlex: W2963899886; Wikidata: Q58615599; Scholia: Q58615599; MaRDI QID: Q5214522; FDO: Q5214522

Authors: Stephen Bates, Robert Tibshirani

Publication date: 7 February 2020

Published in: Biometrics

Abstract: Positive-valued signal data is common in many biological and medical applications, where the data are often generated from imaging techniques such as mass spectrometry. In such a setting, the relative intensities of the raw features are often the scientifically meaningful quantities, so it is of interest to identify relevant features that take the form of log-ratios of the raw inputs. When including the log-ratios of all pairs of predictors, the dimensionality of this predictor space becomes large, so computationally efficient statistical procedures are required. We introduce an embedding of the log-ratio parameter space into a space of much lower dimension and develop an efficient penalized fitting procedure using this more tractable representation. This procedure serves as the foundation for a two-step fitting procedure that combines a convex filtering step with a second non-convex pruning step to yield highly sparse solutions. On a cancer proteomics data set we find that these methods fit highly sparse models with log-ratio features of known biological relevance while greatly improving upon the predictive accuracy of less interpretable methods.
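
Illustrative note: the dimension reduction mentioned in the abstract can be motivated by a simple algebraic identity. Any linear combination of the p(p-1)/2 pairwise log-ratios log(x_j / x_k) equals a linear function of the p log-transformed features whose coefficients sum to zero. The Python sketch below checks this identity numerically on simulated positive data. It is a minimal sketch, not the authors' implementation: the function names pairwise_log_ratios and collapse_to_log_scale, the simulated data, and the dimensions are assumptions made for illustration, and the penalized fitting and pruning steps of the paper are not reproduced here.

```python
import itertools
import numpy as np


def pairwise_log_ratios(X):
    """Build the n x p(p-1)/2 matrix of log(x_j / x_k) for all pairs j < k.

    Hypothetical helper for illustration; returns the matrix and the pair list.
    """
    logX = np.log(X)
    pairs = list(itertools.combinations(range(X.shape[1]), 2))
    Z = np.column_stack([logX[:, j] - logX[:, k] for j, k in pairs])
    return Z, pairs


def collapse_to_log_scale(theta, pairs, p):
    """Map pairwise coefficients theta_{jk} to per-feature coefficients on log(X).

    Each pair (j, k) adds theta_{jk} to feature j and subtracts it from
    feature k, so the resulting vector always sums to zero.
    """
    beta = np.zeros(p)
    for t, (j, k) in zip(theta, pairs):
        beta[j] += t
        beta[k] -= t
    return beta


# Simulated positive-valued data (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
n, p = 20, 6
X = rng.uniform(0.5, 5.0, size=(n, p))

Z, pairs = pairwise_log_ratios(X)          # 15 pairwise features for p = 6
theta = rng.normal(size=len(pairs))        # arbitrary pairwise coefficients
beta = collapse_to_log_scale(theta, pairs, p)

# The two parameterizations give the same linear predictor.
same_fit = np.allclose(Z @ theta, np.log(X) @ beta)
```

In this sketch the pairwise design has 15 columns while the equivalent representation needs only 6 coefficients subject to one linear constraint; the gap grows quadratically with p, which is why working in the smaller space is attractive computationally.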

Full work available at URL: https://arxiv.org/abs/1709.01139

zbMATH Keywords

Lasso; variable selection; compositional data; mass spectrometry; log-ratio

Mathematics Subject Classification ID

Applications of statistics to biology and medical sciences; meta analysis (62P10)
Ridge regression; shrinkage estimators (Lasso) (62J07)

Cited In (5)
