Kernel spectral clustering of large dimensional data

Publication:302428

DOI: 10.1214/16-EJS1144
zbMATH Open: 1398.62160
arXiv: 1510.03547
OpenAlex: W2963144092
MaRDI QID: Q302428
FDO: Q302428


Authors: R. Couillet, Florent Benaych-Georges


Publication date: 5 July 2016

Published in: Electronic Journal of Statistics

Abstract: This article proposes a first analysis of kernel spectral clustering methods in the regime where the dimension p of the data vectors to be clustered and their number n grow large at the same rate. We demonstrate, under a k-class Gaussian mixture model, that the normalized Laplacian matrix associated with the kernel matrix asymptotically behaves similarly to a so-called spiked random matrix. Some of the isolated eigenvalue-eigenvector pairs in this model are shown to carry the clustering information, subject to a separability condition that is classical in spiked matrix models. We evaluate precisely the position of these eigenvalues and the content of the eigenvectors, which unveils important (sometimes quite disruptive) aspects of kernel spectral clustering from both theoretical and practical standpoints. Our results are then compared to the actual clustering performance on images from the MNIST database, revealing a close match between theory and practice.
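
The pipeline described in the abstract (kernel matrix, normalized Laplacian, leading eigenvectors) can be illustrated with a minimal NumPy sketch. It is not the authors' code. The Gaussian kernel f(t) = exp(-t/2) applied to squared distances scaled by 1/p, the normalization D^{-1/2} K D^{-1/2}, the function name `kernel_spectral_embedding`, and the two-class synthetic data are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def kernel_spectral_embedding(X, k):
    """Return the k leading eigenpairs of D^{-1/2} K D^{-1/2}.

    Assumed (illustrative) kernel: K_ij = exp(-||x_i - x_j||^2 / (2 p)).
    """
    n, p = X.shape
    # Pairwise squared Euclidean distances via the Gram matrix.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negative round-off
    K = np.exp(-sq_dists / (2.0 * p))
    # Symmetric degree normalization of the kernel matrix.
    inv_sqrt_deg = 1.0 / np.sqrt(K.sum(axis=1))
    L = K * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]
    # eigh returns eigenvalues in ascending order; keep the k largest.
    vals, vecs = np.linalg.eigh(L)
    return vals[-k:][::-1], vecs[:, -k:][:, ::-1]

# Hypothetical two-class Gaussian mixture with n and p of comparable size.
rng = np.random.default_rng(0)
n, p = 200, 100
mu = np.zeros(p)
mu[0] = 5.0  # class means are +mu and -mu (an arbitrary, well-separated choice)
X = rng.standard_normal((n, p))
X[: n // 2] += mu
X[n // 2 :] -= mu
labels = np.repeat([0, 1], n // 2)

eigvals, eigvecs = kernel_spectral_embedding(X, k=2)

# The top eigenvector is proportional to D^{1/2} 1 and carries no class
# information; here the second one is thresholded at zero to split the data.
pred = (eigvecs[:, 1] > 0).astype(int)
acc = np.mean(pred == labels)
accuracy = max(acc, 1.0 - acc)  # eigenvector sign is arbitrary
print("leading eigenvalues:", eigvals)
print("clustering accuracy:", accuracy)
```

With this normalization the largest eigenvalue equals 1 by construction, so the class structure, when the means are separated enough, shows up in the isolated eigenvalues that follow it. For more than two classes one would typically run k-means on several leading eigenvectors instead of thresholding a single one.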


Full work available at URL: https://arxiv.org/abs/1510.03547






Cited In (27)

