Deep unsupervised feature selection by discarding nuisance and correlated features

DOI: 10.1016/J.NEUNET.2022.04.002; MaRDI QID: Q6077005

Authors Uri Shaham, Ofir Lindenbaum, Jonathan Svirsky, Yuval Kluger

Publication date 17 October 2023

Published in Neural Networks

Full work available at URL https://arxiv.org/abs/2110.05306, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9526895

Keywords: unsupervised feature selection; Laplacian score; concrete layer

Artificial neural networks and deep learning (68T07); Computational aspects of data analysis and big data (68T09); Metric embeddings as related to computational problems and algorithms (68R12)

Abstract: Modern datasets often contain large subsets of correlated features and nuisance features, which are unrelated or only loosely related to the main underlying structures of the data. Nuisance features can be identified using the Laplacian score criterion, which evaluates the importance of a given feature via its consistency with the leading eigenvectors of the graph Laplacian. We demonstrate that in the presence of large numbers of nuisance features, the Laplacian must be computed on the subset of selected features rather than on the complete feature set. To do this, we propose a fully differentiable approach for unsupervised feature selection that uses the Laplacian score criterion to avoid selecting nuisance features. To cope with correlated features, we employ an autoencoder architecture trained to reconstruct the data from the subset of selected features. We build on the recently proposed concrete layer, which controls the number of selected features via architectural design and thereby simplifies the optimization process. Experiments on several real-world datasets show that the proposed approach outperforms similar approaches designed to avoid only correlated or only nuisance features, but not both. Several state-of-the-art clustering results are reported.
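Illustration: the Laplacian score criterion mentioned in the abstract can be sketched in a few lines of NumPy. The block below uses the classical formulation of the score (for a mean-centered feature f, the ratio of f^T L f to f^T D f, where lower values indicate a feature that varies smoothly over the similarity graph). This may differ from the exact variant used in the paper. The function name `laplacian_scores`, the `graph_features` parameter, the Gaussian kernel, and the bandwidth `sigma` are assumptions made for this sketch and are not taken from the publication. The `graph_features` argument mirrors the abstract's point that the graph can be built from a subset of features rather than from the complete feature set.

```python
import numpy as np

def laplacian_scores(X, graph_features=None, sigma=1.0):
    """Classical Laplacian score for each column of X (lower = smoother on the graph).

    graph_features: optional list of column indices used to build the
    similarity graph (hypothetical parameter; defaults to all columns).
    """
    X = np.asarray(X, dtype=float)
    G = X if graph_features is None else X[:, graph_features]

    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(G**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * G @ G.T, 0.0)

    # Gaussian affinity matrix without self-loops (an assumed kernel choice).
    W = np.exp(-d2 / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)

    deg = W.sum(axis=1)          # node degrees (diagonal of D)
    L = np.diag(deg) - W         # unnormalized graph Laplacian L = D - W

    # Remove the degree-weighted mean of every feature.
    Xc = X - (deg @ X) / deg.sum()

    num = np.sum(Xc * (L @ Xc), axis=0)         # f^T L f for each feature
    den = np.sum(deg[:, None] * Xc**2, axis=0)  # f^T D f for each feature
    return num / den

# Toy data: one feature carrying two-cluster structure, one pure-noise feature.
rng = np.random.default_rng(0)
structured = np.concatenate([rng.normal(0, 0.1, 50), rng.normal(10, 0.1, 50)])
nuisance = rng.normal(0, 1, 100)
X = np.column_stack([structured, nuisance])

# Build the graph from the structured feature only, then score both features.
scores = laplacian_scores(X, graph_features=[0])
print(scores)
```

In this toy setting the structured feature receives a much lower score than the noise feature, because the noise feature varies freely between neighboring points of the graph.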

Cited in (1)

Robust autoencoder feature selector for unsupervised feature selection

