How Many Clusters? An Information-Theoretic Perspective
From MaRDI portal
Publication:4664508
Abstract: Clustering provides a common means of identifying structure in complex data, and there is renewed interest in clustering as a tool for the analysis of large data sets in many fields. A natural question is how many clusters are appropriate for the description of a given system. Traditional approaches to this problem are based either on a framework in which clusters of a particular shape are assumed as a model of the system, or on a two-step procedure in which a clustering criterion determines the optimal assignments for a given number of clusters and a separate criterion measures the goodness of the classification to determine the number of clusters. In a statistical mechanics approach, clustering can be seen as a trade-off between energy-like and entropy-like terms, with lower temperature driving the proliferation of clusters to provide a more detailed description of the data. For finite data sets, we expect that there is a limit to the meaningful structure that can be resolved, and therefore a minimum temperature beyond which we begin to capture sampling noise. This suggests that correcting the clustering criterion for the bias which arises due to sampling errors will allow us to find a clustering solution at a temperature which is optimal in the sense that we capture maximal meaningful structure, without having to define an external criterion for the goodness or stability of the clustering. We show that, in a general information-theoretic framework, the finite size of a data set determines an optimal temperature, and we introduce a method for finding the maximal number of clusters which can be resolved from the data in the hard clustering limit.
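The energy/entropy trade-off described in the abstract can be illustrated with a minimal annealed soft-clustering sketch. This is not the paper's method; `soft_kmeans`, the deterministic initialisation, and the toy data are illustrative assumptions, showing only how temperature controls how many distinct clusters survive.

```python
import math

def soft_kmeans(data, k, temperature, iters=200):
    """Annealed soft k-means in 1D: point x is assigned to centre mu_c
    with Boltzmann weight exp(-(x - mu_c)^2 / T).  High T favours the
    entropy-like term, so centres merge; low T favours the energy-like
    (distortion) term, so centres proliferate and resolve finer structure."""
    # Deterministic initialisation (an assumption for this sketch):
    # spread the centres evenly over the data range.
    lo, hi = min(data), max(data)
    centers = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        weights = [0.0] * k
        for x in data:
            # Soft assignment probabilities p(c|x) from Boltzmann weights.
            w = [math.exp(-((x - c) ** 2) / temperature) for c in centers]
            z = sum(w)
            for c in range(k):
                sums[c] += (w[c] / z) * x
                weights[c] += w[c] / z
        # Re-estimate each centre as the weighted mean of its soft members.
        centers = [sums[c] / weights[c] if weights[c] > 0 else centers[c]
                   for c in range(k)]
    return sorted(centers)

# Two well-separated 1D groups, around 0 and around 10.
data = [0.0, 0.2, -0.1, 0.1, 9.9, 10.0, 10.1, 10.2]

cold = soft_kmeans(data, k=2, temperature=0.5)   # low T: both groups resolved
hot = soft_kmeans(data, k=2, temperature=500.0)  # high T: centres merge
```

At low temperature the two centres settle on the two group means, while at high temperature both collapse onto the global mean: the number of *distinct* clusters is set by the temperature, which is the knob whose finite-sample-optimal value the paper characterises.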
Cites work
- A Mathematical Theory of Communication
- Estimating the number of clusters in a data set via the gap statistic
- Model-Based Clustering, Discriminant Analysis, and Density Estimation
- On stochastic complexity and nonparametric density estimation
- Stability-Based Validation of Clustering Solutions
- Statistical Inference, Occam's Razor, and Statistical Mechanics on the Space of Probability Distributions
Cited in (12)
- Enhancing the efficiency of a decision support system through the clustering of complex rule-based knowledge bases and modification of the inference algorithm
- A novel dynamic minimum spanning tree based clustering method for image mining
- Resampling approach for cluster model selection
- Algorithms of maximum likelihood data clustering with applications
- Replica analysis of Bayesian data clustering
- Clustering: how much bias do we need?
- Finding the Number of Clusters in a Dataset
- Optimal causal inference: estimating stored information and approximating causal architecture
- A Robust Information Clustering Algorithm
- Information Theoretical Clustering Is Hard to Approximate
- The information bottleneck and geometric clustering
- The deterministic information bottleneck