Statistical theory in clustering (Q1063979)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Statistical theory in clustering | scientific article | |
Statements
Statistical theory in clustering (English)
1985
A number of statistical models for forming and evaluating clusters are reviewed. Hierarchical algorithms are evaluated by their ability to discover high density regions in a population, and complete linkage hopelessly fails; the others don't do too well either. Single linkage is at least of mathematical interest because it is related to the minimum spanning tree and percolation. Mixture methods are examined, related to k-means, and the failure of likelihood tests for the number of components is noted. The DIP test for estimating the number of modes in a univariate population measures the distance between the empirical distribution function and the closest unimodal distribution function (or k-modal distribution function when testing for k modes). Its properties are examined and multivariate extensions are proposed. Ultrametric and evolutionary distances on trees are considered briefly.
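The link between single linkage and the minimum spanning tree noted in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the function names (`pairwise_distances`, `mst_edge_weights`, `single_linkage_heights`), the use of Euclidean distance, and the sample points are all assumptions made for illustration. The sketch builds a minimum spanning tree with Prim's algorithm, runs a naive single-linkage agglomeration, and lets the reader compare the merge heights with the sorted tree edge weights.

```python
import itertools
import math


def pairwise_distances(points):
    """Return a symmetric matrix of Euclidean distances (assumed metric)."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def mst_edge_weights(dist):
    """Prim's algorithm; returns the sorted weights of the MST edges."""
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from the tree to each vertex
    best[0] = 0.0
    weights = []
    for _ in range(n):
        # Add the vertex outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        weights.append(best[u])
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return sorted(weights[1:])  # drop the 0.0 recorded for the start vertex


def single_linkage_heights(dist):
    """Naive agglomeration: repeatedly merge the two clusters whose closest
    pair of members is nearest; returns the merge heights in merge order."""
    clusters = [[i] for i in range(len(dist))]
    heights = []
    while len(clusters) > 1:
        h, a, b = min(
            (min(dist[i][j] for i in clusters[a] for j in clusters[b]), a, b)
            for a, b in itertools.combinations(range(len(clusters)), 2)
        )
        heights.append(h)
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return heights


# Hypothetical sample data: two tight pairs and one distant point.
points = [(0, 0), (1, 0), (5, 0), (6, 0), (20, 0)]
dist = pairwise_distances(points)
print(single_linkage_heights(dist))
print(mst_edge_weights(dist))
```

On this sample the two printed lists coincide: each single-linkage merge corresponds to adding one edge of the minimum spanning tree in order of increasing weight. The naive agglomeration is written for readability, not efficiency.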
clustering
tests of unimodality
Hierarchical algorithms
high density regions
complete linkage
Single linkage
minimum spanning tree
percolation
Mixture methods
likelihood tests
DIP test
empirical distribution
unimodal distribution
Ultrametric
evolutionary distances