Flexible Variable Selection for Clustering and Classification

DOI: 10.48550/ARXIV.2305.16464
arXiv: 2305.16464
MaRDI QID: Q5980940

Authors: Paul D. McNicholas, MacKenzie R. Neal

Publication date: 25 May 2023

Abstract: The importance of variable selection for clustering has been recognized for some time, and mixture models are well-established as a statistical approach to clustering. Yet, the literature on variable selection in model-based clustering remains largely rooted in the assumption of Gaussian clusters. Unsurprisingly, variable selection algorithms based on this assumption tend to break down in the presence of cluster skewness. A novel variable selection algorithm is presented that utilizes the Manly transformation mixture model to select variables based on their ability to separate clusters, and is effective even when clusters depart from the Gaussian assumption. The proposed approach, which is implemented within the R package vscc, is compared to existing variable selection methods -- including an existing method that can account for cluster skewness -- using simulated and real datasets.

Cited in (1)

vscc
