Pursuing Sources of Heterogeneity in Modeling Clustered Population
Abstract: Researchers often have to deal with heterogeneous populations with mixed regression relationships, increasingly so in the era of data explosion. In such problems, when there are many candidate predictors, it is of interest not only to identify the predictors that are associated with the outcome, but also to distinguish the true sources of heterogeneity, i.e., to identify the predictors that have different effects among the clusters and thus are the true contributors to the formation of the clusters. We clarify the concept of the source of heterogeneity so that it accounts for potential scale differences among the clusters, and propose a regularized finite mixture effects regression to achieve heterogeneity pursuit and feature selection simultaneously. As the name suggests, the problem is formulated under an effects-model parameterization, in which the cluster labels are missing and the effect of each predictor on the outcome is decomposed into a common effect term and a set of cluster-specific terms. A constrained sparse estimation of these effects leads to the identification of both the variables with common effects and those with heterogeneous effects. We propose an efficient algorithm and show that our approach can achieve both estimation and selection consistency. Simulation studies further demonstrate the effectiveness of our method under various practical scenarios. Three applications are presented, namely, an imaging genetics study linking genetic factors and brain neuroimaging traits in Alzheimer's disease, a public health study exploring the association between suicide risk among adolescents and their school district characteristics, and a sports analytics study examining how the salary levels of baseball players are associated with their performance and contractual status.
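The effects-model decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' estimator: it only shows how known per-cluster coefficients could be split into a common effect and cluster-specific terms. It assumes a sum-to-zero constraint on the cluster-specific terms (the abstract does not state which constraint is used) and ignores the scale differences between clusters that the paper accounts for. The function name `decompose_effects` and the tolerance `tol` are hypothetical.

```python
def decompose_effects(cluster_coefs, tol=1e-8):
    """Split per-cluster coefficients into common and cluster-specific parts.

    cluster_coefs[k][j] is the coefficient of predictor j in cluster k.
    ASSUMPTION: cluster-specific terms sum to zero over clusters, so the
    common effect of predictor j is the mean of its per-cluster coefficients.
    """
    n_clusters = len(cluster_coefs)
    n_predictors = len(cluster_coefs[0])

    # Common effect: average coefficient of each predictor across clusters.
    common = [
        sum(cluster_coefs[k][j] for k in range(n_clusters)) / n_clusters
        for j in range(n_predictors)
    ]

    # Cluster-specific terms: deviation of each cluster from the common effect.
    specific = [
        [cluster_coefs[k][j] - common[j] for j in range(n_predictors)]
        for k in range(n_clusters)
    ]

    # A predictor is relevant if it has a nonzero coefficient in some cluster.
    relevant = [
        j for j in range(n_predictors)
        if any(abs(cluster_coefs[k][j]) > tol for k in range(n_clusters))
    ]

    # A predictor is a source of heterogeneity if some cluster deviates
    # from the common effect.
    heterogeneous = [
        j for j in range(n_predictors)
        if any(abs(specific[k][j]) > tol for k in range(n_clusters))
    ]
    return common, specific, relevant, heterogeneous


# Two clusters, three predictors (made-up numbers for illustration only):
# predictor 0 has the same effect in both clusters, predictor 1 differs
# between clusters, predictor 2 has no effect at all.
coefs = [[1.0, 0.5, 0.0],
         [1.0, -0.5, 0.0]]
common, specific, relevant, heterogeneous = decompose_effects(coefs)
print(common)         # [1.0, 0.0, 0.0]
print(relevant)       # [0, 1]
print(heterogeneous)  # [1]
```

In this toy input, predictors 0 and 1 are both associated with the outcome, but only predictor 1 is a source of heterogeneity. In the paper's setting the cluster labels are missing and the coefficients are unknown, so this split has to be estimated rather than computed directly.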
Recommendations
- scientific article; zbMATH DE number 3925960
- scientific article; zbMATH DE number 3936992
- Modeling and estimation problems for structured heterogeneous populations
- The effect of clumped population structure on the variability of spreading dynamics
- scientific article; zbMATH DE number 5245021
- Modelling Clustered Heterogeneity: Fixed Effects, Random Effects and Mixtures
- A heterogeneity measure for cluster identification with application to disease mapping
- scientific article; zbMATH DE number 701763
- Mixture models applied to heterogeneous populations
Cites work
- scientific article; zbMATH DE number 3567782
- scientific article; zbMATH DE number 845714
- A Markov model for switching regressions
- A New Semiparametric Approach to Finite Mixture of Regressions using Penalized Regression via Fusion
- A mixture likelihood approach for generalized linear models
- A tailored multivariate mixture model for detecting proteins of concordant change among virulent strains of \textit{Clostridium perfringens}
- Adaptive Lasso for sparse high-dimensional regression models
- An overview of the new feature selection methods in finite mixture of regression models
- Estimation of multiple networks in Gaussian mixture models
- Hierarchical mixtures-of-experts for exponential family regression models: Approximation and maximum likelihood estimation
- Individualized Multidirectional Variable Selection
- Maximum likelihood estimation via the ECM algorithm: A general framework
- Mixture of linear mixed models using multivariate \(t\) distribution
- Multi-species distribution modeling using penalized mixture of regressions
- Parameter estimation for mixtures of skew Laplace normal distributions and application in mixture regression modeling
- Regularization and Variable Selection Via the Elastic Net
- Regularization in finite mixture of regression models with diverging number of parameters
- Sparse regression with exact clustering
- The Adaptive Lasso and Its Oracle Properties
- The Split Bregman Method for L1-Regularized Problems
- The solution path of the generalized lasso
- Variable Selection in Finite Mixture of Regression Models
- Variable Selection in Penalized Model‐Based Clustering Via Regularization on Grouped Parameters
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
- \(\ell_{1}\)-penalization for mixture regression models
Cited in
(5)
MaRDI item Q130716