Asymptotic properties of univariate sample k-means clusters (Q1062391)
From MaRDI portal
Revision as of 03:05, 5 March 2024
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Asymptotic properties of univariate sample k-means clusters | scientific article | |
Statements
Asymptotic properties of univariate sample k-means clusters (English)
1984
The problem of optimally partitioning a univariate random sample of size N on [a,b] into k clusters is considered for the case where k approaches infinity with N (such that the length of each cluster interval approaches zero while the number of observations in each cluster approaches infinity). This k-means clustering method minimizes the within-cluster sums of squares, and it can potentially be used for constructing variable-cell histograms. It is shown that the sample locally optimal partition approaches the population optimal partition under certain regularity conditions (the population density is positive and has four bounded derivatives on [a,b]). Some large-sample properties of this method are obtained: the within-cluster sums of squares of the sample k-means clusters are asymptotically equal, and the lengths of the cluster intervals are inversely proportional to the one-third power of the underlying density at the midpoints of the intervals. The multivariate case requires further investigation, as the generalization of the univariate results to many dimensions is not straightforward.
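The two large-sample properties stated in the review can be explored numerically. The following is a minimal sketch, not the method of the reviewed article: it uses Lloyd's algorithm (a standard way to reach a locally optimal k-means partition) on a sample drawn from an assumed density f(x) = 0.5 + x on [0,1]. The names `sample_density`, `lloyd_1d`, `wss` and `demo` are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_density(rng):
    """Draw from the assumed density f(x) = 0.5 + x on [0, 1] by inverse CDF."""
    u = rng.random()
    # F(x) = 0.5*x + 0.5*x**2, so x solves x**2 + x - 2u = 0.
    return (-1.0 + math.sqrt(1.0 + 8.0 * u)) / 2.0


def lloyd_1d(xs, k, iters=100):
    """Lloyd's algorithm in one dimension, started from equal-count clusters.

    Returns (cuts, centers, groups): the k-1 interval boundaries, the k
    cluster means, and the observations assigned to each cluster.
    """
    xs = sorted(xs)
    n = len(xs)
    centers = []
    for j in range(k):
        chunk = xs[j * n // k:(j + 1) * n // k]
        centers.append(sum(chunk) / len(chunk))
    for _ in range(iters):
        # In 1-D each cluster is an interval whose boundaries are the
        # midpoints between adjacent cluster means.
        cuts = [(centers[j] + centers[j + 1]) / 2.0 for j in range(k - 1)]
        groups = [[] for _ in range(k)]
        j = 0
        for x in xs:
            while j < k - 1 and x > cuts[j]:
                j += 1
            groups[j].append(x)
        # An empty cluster keeps its previous center.
        new_centers = [sum(g) / len(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return cuts, centers, groups


def wss(group):
    """Within-cluster sum of squares of one cluster (0.0 if it is empty)."""
    if not group:
        return 0.0
    m = sum(group) / len(group)
    return sum((x - m) ** 2 for x in group)


def demo(n=5000, k=10, seed=0):
    rng = random.Random(seed)
    xs = [sample_density(rng) for _ in range(n)]
    cuts, centers, groups = lloyd_1d(xs, k)
    edges = [0.0] + cuts + [1.0]
    for j in range(k):
        length = edges[j + 1] - edges[j]
        mid = (edges[j] + edges[j + 1]) / 2.0
        # If length is proportional to f(mid)**(-1/3), this product
        # should be roughly constant across clusters.
        scaled = length * (0.5 + mid) ** (1.0 / 3.0)
        print(f"cluster {j}: WSS={wss(groups[j]):.4f}  length*f^(1/3)={scaled:.4f}")


if __name__ == "__main__":
    demo()
```

For finite N and k the printed values agree only approximately; the statements in the review are asymptotic, and the first and last intervals are affected by the endpoints a and b.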
cluster lengths
non-standard asymptotics
optimal partition
univariate random sample
k-means clustering method
within-cluster sums of squares
variable-cell histograms
large sample properties