Variable selection for clustering and classification
Publication:288977
DOI: 10.1007/S00357-013-9139-2; zbMATH Open: 1360.62310; arXiv: 1303.5294; OpenAlex: W2075711194; MaRDI QID: Q288977; FDO: Q288977
Authors: Jeffrey L. Andrews, Paul D. McNicholas
Publication date: 27 May 2016
Published in: Journal of Classification
Abstract: As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering algorithms are based upon determining the best variable subspace according to model fitting in a stepwise manner. These techniques are often computationally intensive and can require extended periods of time to run; in fact, some are prohibitively computationally expensive for high-dimensional data. In this paper, a novel variable selection technique is introduced for use in clustering and classification analyses that is both intuitive and computationally efficient. We focus largely on applications in mixture model-based learning, but the technique could be adapted for use with various other clustering/classification methods. Our approach is illustrated on both simulated and real data, highlighted by contrasting its performance with that of other comparable variable selection techniques on the real data sets.
Full work available at URL: https://arxiv.org/abs/1303.5294
Recommendations
- Variable Selection for Model-Based Clustering
- Variable selection in model-based clustering and discriminant analysis with a regularization approach
- Selection of variables in cluster analysis: An empirical comparison of eight procedures
- Variable selection methods for model-based clustering
- Variable Selection for Model-Based High-Dimensional Clustering and Its Application to Microarray Data
Keywords: variable selection; classification; cluster analysis; high-dimensional data; mixture models; model-based clustering
Cites Work
- Statistical Analysis of Financial Data in S-Plus
- A Framework for Feature Selection in Clustering
- Estimating the dimension of a model
- Simultaneous model-based clustering and visualization in the Fisher discriminative subspace
- Generation of random clusters with specified degree of separation
- Variable Selection for Clustering with Gaussian Mixture Models
- Model-Based Clustering, Discriminant Analysis, and Density Estimation
- Variable Selection for Model-Based Clustering
- Model-based cluster and discriminant analysis with the MIXMOD software
- Dimensionally reduced model-based clustering through mixtures of factor mixture analyzers
- Bayes Factors
- Heteroscedastic factor mixture analysis
- Enhanced model-based clustering, density estimation, and discriminant analysis software: MCLUST
- Extending mixtures of multivariate \(t\)-factor analyzers
- Mixtures of modified \(t\)-factor analyzers for model-based clustering, classification, and discriminant analysis
- Influence of lattice vibration on the ground state of magnetopolaron in a parabolic quantum dot
Cited In (26)
- On fractionally-supervised classification: weight selection and extension to the multivariate \(t\)-distribution
- Robust variable selection for model-based learning in presence of adulteration
- Model-based clustering
- Variable selection methods for model-based clustering
- Modelling the role of variables in model-based cluster analysis
- Multivariate response and parsimony for Gaussian cluster-weighted models
- Projection under pairwise distance control
- Significance analysis for pairwise variable selection in classification
- Variable Selection for Clustering with Gaussian Mixture Models
- Testing equality of standardized generalized variances of \(k\) multivariate normal populations with arbitrary dimensions
- Sparse and geometry-aware generalisation of the mutual information for joint discriminative clustering and feature selection
- Simultaneous variable weighting and determining the number of clusters -- a weighted Gaussian means algorithm
- Quantile autocovariances: a powerful tool for hard and soft partitional clustering of time series
- Using Bayesian latent Gaussian graphical models to infer symptom associations in verbal autopsies
- An ensemble feature ranking algorithm for clustering analysis
- Variable selection for mixed data clustering: application in human population genomics
- Selection of Variables for Cluster Analysis and Classification Rules
- A mixture of generalized hyperbolic distributions
- Piecewise regression mixture for simultaneous functional data clustering and optimal segmentation
- A hierarchical Bayesian approach for examining heterogeneity in choice decisions
- Variable Selection for Skewed Model-Based Clustering: Application to the Identification of Novel Sleep Phenotypes
- Supervised clustering of variables
- On the strong consistency of feature-weighted \(k\)-means clustering in a nearmetric space
- A projection pursuit approach to variable selection
- rCOSA: a software package for clustering objects on subsets of attributes
- Unsupervised classification with a family of parsimonious contaminated shifted asymmetric Laplace mixtures