Variable selection for clustering and classification
Abstract: As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering algorithms are based upon determining the best variable subspace according to model fitting in a stepwise manner. These techniques are often computationally intensive and can require extended periods of time to run; in fact, some are prohibitively computationally expensive for high-dimensional data. In this paper, a novel variable selection technique is introduced for use in clustering and classification analyses that is both intuitive and computationally efficient. We focus largely on applications in mixture model-based learning, but the technique could be adapted for use with various other clustering/classification methods. Our approach is illustrated on both simulated and real data, highlighted by contrasting its performance with that of other comparable variable selection techniques on the real data sets.
Recommendations
- Variable Selection for Model-Based Clustering
- Variable selection in model-based clustering and discriminant analysis with a regularization approach
- Selection of variables in cluster analysis: An empirical comparison of eight procedures
- Variable selection methods for model-based clustering
- Variable Selection for Model-Based High-Dimensional Clustering and Its Application to Microarray Data
Cites work
- A framework for feature selection in clustering
- Bayes Factors
- Dimensionally reduced model-based clustering through mixtures of factor mixture analyzers
- Enhanced model-based clustering, density estimation, and discriminant analysis software: MCLUST
- Estimating the dimension of a model
- Extending mixtures of multivariate \(t\)-factor analyzers
- Generation of random clusters with specified degree of separation
- Heteroscedastic factor mixture analysis
- Influence of lattice vibration on the ground state of magnetopolaron in a parabolic quantum dot
- Mixtures of modified \(t\)-factor analyzers for model-based clustering, classification, and discriminant analysis
- Model-Based Clustering, Discriminant Analysis, and Density Estimation
- Model-based cluster and discriminant analysis with the MIXMOD software
- Simultaneous model-based clustering and visualization in the Fisher discriminative subspace
- Statistical Analysis of Financial Data in S-Plus
- Variable Selection for Clustering with Gaussian Mixture Models
- Variable Selection for Model-Based Clustering
Cited in (30)
- Projection under pairwise distance control
- Using Bayesian latent Gaussian graphical models to infer symptom associations in verbal autopsies
- A simple model-based approach to variable selection in classification and clustering
- rCOSA: a software package for clustering objects on subsets of attributes
- Robust variable selection for model-based learning in presence of adulteration
- A projection pursuit approach to variable selection
- An ensemble feature ranking algorithm for clustering analysis
- Variable selection for mixed data clustering: application in human population genomics
- Testing equality of standardized generalized variances of \(k\) multivariate normal populations with arbitrary dimensions
- Model-based clustering
- Unsupervised classification with a family of parsimonious contaminated shifted asymmetric Laplace mixtures
- Piecewise regression mixture for simultaneous functional data clustering and optimal segmentation
- Sparse and geometry-aware generalisation of the mutual information for joint discriminative clustering and feature selection
- Significance analysis for pairwise variable selection in classification
- A mixture of generalized hyperbolic distributions
- A variable-selection heuristic for K-means clustering
- Variable selection methods for model-based clustering
- Modelling the role of variables in model-based cluster analysis
- Selection of Variables for Cluster Analysis and Classification Rules
- Variable Selection for Clustering with Gaussian Mixture Models
- Combining clustering of variables and feature selection using random forests
- A hierarchical Bayesian approach for examining heterogeneity in choice decisions
- Multivariate response and parsimony for Gaussian cluster-weighted models
- On the strong consistency of feature-weighted \(k\)-means clustering in a nearmetric space
- Supervised clustering of variables
- Discriminative variable selection for clustering with the sparse Fisher-EM algorithm
- Variable selection for skewed model-based clustering: application to the identification of novel sleep phenotypes
- Quantile autocovariances: a powerful tool for hard and soft partitional clustering of time series
- Simultaneous variable weighting and determining the number of clusters -- a weighted Gaussian means algorithm
- On fractionally-supervised classification: weight selection and extension to the multivariate \(t\)-distribution
This page was built for publication: Variable selection for clustering and classification
MaRDI item: Q288977