A general framework for penalized mixed-effects multitask learning with applications on DNA methylation surrogate biomarkers creation
Publication:6138641
DOI: 10.1214/23-AOAS1760
arXiv: 2112.12719
OpenAlex: W4388088761
MaRDI QID: Q6138641
FDO: Q6138641
Authors: Andrea Cappozzo, Francesca Ieva, Giovanni Fiorito
Publication date: 16 January 2024
Published in: The Annals of Applied Statistics
Abstract: Recent evidence highlights the usefulness of DNA methylation (DNAm) biomarkers as surrogates for exposure to risk factors for non-communicable diseases in epidemiological studies and randomized trials. DNAm variability has been demonstrated to be tightly related to lifestyle behavior and exposure to environmental risk factors, ultimately providing an unbiased proxy of an individual's state of health. At present, the creation of DNAm surrogates relies on univariate penalized regression models, with the elastic-net regularizer being the gold standard for the task. Nonetheless, more advanced modeling procedures are required in the presence of multivariate outcomes with a structured dependence pattern among the study samples. In this work we propose a general framework for mixed-effects multitask learning in the presence of high-dimensional predictors to develop a multivariate DNAm biomarker from a multi-center study. A penalized estimation scheme based on an expectation-maximization algorithm is devised, in which any penalty criterion for fixed-effects models can be conveniently incorporated in the fitting process. We apply the proposed methodology to create novel DNAm surrogate biomarkers for multiple correlated risk factors for cardiovascular diseases and comorbidities. We show that the proposed approach, modeling multiple outcomes together, outperforms state-of-the-art alternatives, both in predictive power and in bio-molecular interpretation of the results.
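Illustrative sketch (not the authors' algorithm): the abstract describes an EM scheme in which a penalized fixed-effects solver is plugged into the fitting loop. The Python code below shows that idea in a deliberately simplified setting, assuming a single outcome and a random intercept per center rather than the paper's multivariate model. The names `ridge_solver` and `penalized_em_random_intercept` are hypothetical, ridge is used only as a stand-in penalty, and the error-variance scaling of the penalty is absorbed into `lam`.

```python
import numpy as np

def ridge_solver(X, y, lam):
    """Stand-in penalized fixed-effects solver (ridge).
    Any solver with the same signature (lasso, elastic net, ...) could be swapped in."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def penalized_em_random_intercept(X, y, groups, lam=1.0,
                                  solver=ridge_solver, n_iter=50):
    """EM-type fit of y_ij = x_ij' beta + b_i + e_ij,
    with b_i ~ N(0, tau2) and e_ij ~ N(0, sigma2).
    Simplifying assumption: the sigma2 scaling of the penalty is absorbed into lam."""
    n = len(y)
    labels, idx = np.unique(groups, return_inverse=True)
    n_groups = len(labels)
    counts = np.bincount(idx, minlength=n_groups)

    # Initialize: penalized fit ignoring grouping, crude variance guesses.
    beta = solver(X, y, lam)
    sigma2 = np.var(y - X @ beta)
    tau2 = 1.0

    for _ in range(n_iter):
        # E-step: posterior mean and variance of each random intercept.
        resid = y - X @ beta
        sums = np.bincount(idx, weights=resid, minlength=n_groups)
        denom = sigma2 + counts * tau2
        b_mean = tau2 * sums / denom
        b_var = tau2 * sigma2 / denom

        # M-step (conditional updates): penalized regression on the
        # response with the expected random effects removed, then variances.
        beta = solver(X, y - b_mean[idx], lam)
        err = y - X @ beta - b_mean[idx]
        sigma2 = (err @ err + np.sum(counts * b_var)) / n
        tau2 = np.mean(b_mean ** 2 + b_var)

    return beta, sigma2, tau2, b_mean
```

Under these assumptions, the fixed-effects update only ever sees an adjusted response, which is why an arbitrary penalized regression routine can be passed as `solver` without changing the rest of the loop.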
Full work available at URL: https://arxiv.org/abs/2112.12719
Keywords: EM algorithm; multivariate regression; penalized estimation; mixed-effects models; personalized medicine; multitask learning
Cites Work
- Linear mixed-effects models using R. A step-by-step approach
- A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis
- Extensions of sparse canonical correlation analysis with applications to genomic data
- Selection of fixed effects in high dimensional linear mixed models using a multicycle ECM algorithm
- Estimating the dimension of a model
- Mixed models. Theory and applications with R
- Maximum likelihood estimation via the ECM algorithm: A general framework
- Sure Independence Screening for Ultrahigh Dimensional Feature Space
- Model Selection and Estimation in Regression with Grouped Variables
- Variable selection and regression analysis for graph-structured covariates with an application to genomics
- Tree-guided group lasso for multi-response regression with structured sparsity, with an application to eQTL mapping
- The EM Algorithm and Extensions, 2E
- Adjusting batch effects in microarray expression data using empirical Bayes methods
- Estimation for high-dimensional linear mixed-effects models using \(\ell_1\)-penalization
- Some matrix-variate distribution theory: Notational considerations and a Bayesian application
- Misspecifying the shape of a random effects distribution: why getting it wrong may not matter
- A Random-Effects Model for Multiple Characteristics With Possibly Missing Data
- Ultrahigh dimensional feature selection: beyond the linear model
- Support union recovery in high-dimensional multivariate regression
- On statistics, computation and scalability
- Multivariate random effect models with complete and incomplete data
- Network-based penalized regression with application to genomic data
- Estimation and Prediction in a Multivariate Random Effects Generalized Linear Model
- Censored mean variance sure independence screening for ultrahigh dimensional survival data
- Multivariate sparse group Lasso for the multivariate multiple linear regression with an arbitrary group structure
- An Iterative Sparse-Group Lasso