Statistical theory in clustering (Q1063979)

scientific article; zbMATH DE number 3919579

      Statements

      Statistical theory in clustering (English)
      1985
      A number of statistical models for forming and evaluating clusters are reviewed. Hierarchical algorithms are evaluated by their ability to discover high density regions in a population, and complete linkage hopelessly fails; the others don't do too well either. Single linkage is at least of mathematical interest because it is related to the minimum spanning tree and percolation. Mixture methods are examined, related to k-means, and the failure of likelihood tests for the number of components is noted. The DIP test for estimating the number of modes in a univariate population measures the distance between the empirical distribution function and the closest unimodal distribution function (or k-modal distribution function when testing for k modes). Its properties are examined and multivariate extensions are proposed. Ultrametric and evolutionary distances on trees are considered briefly.
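The relation between single linkage and the minimum spanning tree noted in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the function names `minimum_spanning_tree` and `single_linkage_clusters`, the use of Euclidean distance, and the sample points are all assumptions chosen for illustration. The sketch relies on the standard fact that the single-linkage clusters at a given height are the connected components that remain after every minimum spanning tree edge longer than that height is removed.

```python
from itertools import combinations
from math import dist


def _find(parent, i):
    """Return the root of i's component, halving paths along the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def minimum_spanning_tree(points):
    """Kruskal's algorithm on the complete Euclidean graph.

    Returns the tree as a list of (length, i, j) edges, shortest first.
    """
    parent = list(range(len(points)))
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    tree = []
    for length, i, j in edges:
        root_i, root_j = _find(parent, i), _find(parent, j)
        if root_i != root_j:  # edge joins two components, so keep it
            parent[root_i] = root_j
            tree.append((length, i, j))
    return tree


def single_linkage_clusters(points, height):
    """Single-linkage clusters at the given height.

    These are the connected components left after deleting every
    MST edge longer than `height`.
    """
    parent = list(range(len(points)))
    for length, i, j in minimum_spanning_tree(points):
        if length <= height:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(_find(parent, i), []).append(i)
    return sorted(groups.values())


# Hypothetical sample: two well-separated groups of points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
clusters = single_linkage_clusters(points, height=2.0)
```

With these sample points, the only tree edge longer than 2.0 is the one bridging the two groups, so `clusters` separates indices 0 to 2 from indices 3 and 4. Building the tree once and cutting it at different heights recovers the whole single-linkage hierarchy without recomputing distances.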
      clustering
      tests of unimodality
      Hierarchical algorithms
      high density regions
      complete linkage
      Single linkage
      minimum spanning tree
      percolation
      Mixture methods
      likelihood tests
      DIP test
      empirical distribution
      unimodal distribution
      Ultrametric
      evolutionary distances

      Identifiers