Pan-disease clustering analysis of the trend of period prevalence
Publication:2078313
Abstract: For all diseases, prevalence has been carefully studied. In the "classic" paradigm, the prevalence of different diseases has usually been studied separately. Accumulating evidence has shown that diseases can be "correlated". The joint analysis of the prevalence of multiple diseases can provide important insights beyond individual-disease analysis but has not been well conducted. In this study, we take advantage of the uniquely valuable Taiwan National Health Insurance Research Database (NHIRD) and conduct a pan-disease analysis of period prevalence trends. The goal is to identify clusters within which diseases share similar period prevalence trends. For this purpose, a novel penalization pursuit approach is developed, which has an intuitive formulation and satisfactory properties. In the data analysis, the period prevalence values are computed using records on close to 1 million subjects and 14 years of observation. For 405 diseases, 35 nontrivial clusters (with sizes larger than one) and 27 trivial clusters (with size one) are identified. The results differ significantly from those of the alternatives. A closer examination suggests that the clustering results have sound interpretations. This study is the first to conduct a pan-disease clustering analysis of prevalence trends using the uniquely valuable NHIRD data and can have important value in multiple aspects.
Recommendations
- Spatial and Temporal Analysis of Disease Occurrence for Detection of Clustering
- Case-cohort analysis of clusters of recurrent events
- Investigating Disease Clusters: Why, When and How?
- A new relation between prevalence and incidence of a chronic disease
- Estimating disease prevalence in two-phase studies
Cites work
- Age-Specific Incidence and Prevalence: A Statistical Perspective
- Distance-based clustering of sparsely observed stochastic processes, with applications to online auctions
- Functional data analysis.
- Functional data clustering: a survey
- Fused Lasso approach in regression coefficients clustering -- learning parameter heterogeneity in data integration
- Grouping pursuit through a regularization solution surface
- Nonparametric incidence estimation from prevalent cohort survival data
- Statistics for high-dimensional data. Methods, theory and applications.
- The discriminative functional mixture model for a comparative analysis of bike sharing systems
- Unsupervised Curve Clustering using B‐Splines