Statistical theory in clustering (Q1063979)

From MaRDI portal
scientific article

    Statements

    Statistical theory in clustering (English)
    1985
    A number of statistical models for forming and evaluating clusters are reviewed. Hierarchical algorithms are evaluated by their ability to discover high-density regions in a population; complete linkage fails hopelessly, and the others do not do well either. Single linkage is at least of mathematical interest because it is related to the minimum spanning tree and to percolation. Mixture methods are examined and related to k-means, and the failure of likelihood tests for the number of components is noted. The DIP test for estimating the number of modes in a univariate population measures the distance between the empirical distribution function and the closest unimodal distribution function (or the closest k-modal distribution function when testing for k modes). Its properties are examined and multivariate extensions are proposed. Ultrametric and evolutionary distances on trees are considered briefly.
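The relation between single linkage and the minimum spanning tree mentioned in the review can be illustrated with a minimal sketch. It is not taken from the article: it assumes Euclidean distances and uses the standard fact that deleting the k - 1 longest edges of a minimum spanning tree leaves the k single-linkage clusters. The names `UnionFind`, `mst_edges` and `single_linkage` are hypothetical helpers introduced only for this illustration.

```python
import math
from itertools import combinations


class UnionFind:
    """Disjoint-set structure used both to build the tree and to read off clusters."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, i):
        # Path halving keeps the trees shallow.
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]
            i = self.parent[i]
        return i

    def union(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return False
        self.parent[ri] = rj
        return True


def mst_edges(points):
    """Kruskal's algorithm on the complete graph with Euclidean edge lengths (an assumption).

    Returns the n - 1 tree edges as (length, i, j), in nondecreasing order of length.
    """
    n = len(points)
    candidates = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    uf = UnionFind(n)
    return [(d, i, j) for d, i, j in candidates if uf.union(i, j)]


def single_linkage(points, k):
    """Split the points into k clusters by dropping the k - 1 longest tree edges."""
    n = len(points)
    kept = mst_edges(points)[: n - k]  # keep only the n - k shortest tree edges
    uf = UnionFind(n)
    for _, i, j in kept:
        uf.union(i, j)
    clusters = {}
    for i in range(n):
        clusters.setdefault(uf.find(i), []).append(i)
    return sorted(clusters.values())
```

For example, calling `single_linkage` on the points `(0, 0)`, `(0, 1)`, `(1, 0)`, `(10, 10)`, `(10, 11)`, `(25, 0)` with `k = 3` groups the first three points, the next two, and the last point on its own. The sketch enumerates all pairs of points, so it is meant for small illustrative inputs only.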
    clustering
    tests of unimodality
    Hierarchical algorithms
    high density regions
    complete linkage
    Single linkage
    minimum spanning tree
    percolation
    Mixture methods
    likelihood tests
    DIP test
    empirical distribution
    unimodal distribution
    Ultrametric
    evolutionary distances