Targeting underrepresented populations in precision medicine: a federated transfer learning approach
Publication:6138616
DOI: 10.1214/23-AOAS1747; arXiv: 2108.12112; OpenAlex: W3197203470; MaRDI QID: Q6138616; FDO: Q6138616
Publication date: 16 January 2024
Published in: The Annals of Applied Statistics
Abstract: The limited representation of minorities and disadvantaged populations in large-scale clinical and genomics research has become a barrier to translating precision medicine research into practice. Due to heterogeneity across populations, risk prediction models often underperform in these underrepresented populations and may therefore further exacerbate known health disparities. In this paper, we propose a two-way data integration strategy that integrates heterogeneous data from diverse populations and from multiple healthcare institutions via a federated transfer learning approach. The proposed method can handle the challenging setting where sample sizes from different populations are highly unbalanced. With only a small number of communications across participating sites, the proposed method can achieve performance comparable to the pooled analysis, in which individual-level data are directly pooled together. We show that the proposed method improves the estimation and prediction accuracy in underrepresented populations and reduces the gap in model performance across populations. Our theoretical analysis reveals how estimation accuracy is influenced by communication budgets, privacy restrictions, and heterogeneity across populations. We demonstrate the feasibility and validity of our methods through numerical experiments and a real application to a multi-center study, in which we construct polygenic risk prediction models for Type II diabetes in the AA population.
Full work available at URL: https://arxiv.org/abs/2108.12112
Cites Work
- Simultaneous analysis of Lasso and Dantzig selector
- High-dimensional generalized linear models and the lasso
- Optimal learning with Q-aggregation
- Transfer Learning under High-dimensional Generalized Linear Models
- Transfer learning for nonparametric classification: minimax rate and adaptive classifier
- Transfer Learning for High-Dimensional Linear Regression: Prediction, Estimation and Minimax Optimality
- Exponential screening and optimal rates of sparse estimation
- A split-and-conquer approach for analysis of
- Communication-Efficient Distributed Statistical Inference
- Heterogeneity-aware and communication-efficient distributed statistical inference
- Individual Data Protected Integrative Regression Analysis of High-Dimensional Heterogeneous Data
Cited In (2)