Clustering by principal component analysis with Gaussian kernel in high-dimension, low-sample-size settings (Q2048123)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Clustering by principal component analysis with Gaussian kernel in high-dimension, low-sample-size settings | scientific article | |
Statements
Clustering by principal component analysis with Gaussian kernel in high-dimension, low-sample-size settings (English)
5 August 2021
The authors considered a clustering method based on kernel principal component analysis (KPCA) for high-dimension, low-sample-size (HDLSS) data. First, they investigated asymptotic properties of the KPCA with the linear and Gaussian kernels under the two-class \((k = 2)\) model. Their results seem to extend important results given by \textit{K. Yata} and \textit{M. Aoshima} [Scand. J. Stat. 47, No. 3, 899--921 (2020; Zbl 1454.62188)]. Second, they showed that HDLSS data can be classified by the sign of the first PC (principal component) scores, and they gave theoretical reasons why the Gaussian kernel is effective for clustering high-dimensional data. Third, they discussed the choice of the scale parameter \(\gamma\) that yields high performance of the KPCA with the Gaussian kernel. They then showed that the Gaussian kernel with this choice of \(\gamma\) performs well both in numerical simulations and in analyses of real data (three microarray data sets given in the supplemental material of \textit{M. Mramor} et al. [``Visualization-based cancer microarray data classification analysis'', Bioinformatics 23, 2147--2154 (2007)]).
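As a rough illustration of the clustering idea described above, the sketch below applies KPCA with the Gaussian (radial basis function) kernel \(k(x_i, x_j) = \exp(-\gamma\|x_i - x_j\|^2)\) to toy two-class HDLSS data and assigns clusters by the sign of the first PC scores. This is only a minimal sketch on assumed toy data, not the authors' procedure; in particular, the median-distance heuristic used for \(\gamma\) below is a placeholder assumption, not the scale-parameter choice proposed in the paper.

```python
# Minimal sketch: cluster HDLSS data by the sign of the first kernel PC score
# computed with a Gaussian kernel. Toy data and the gamma heuristic are
# assumptions for illustration only, not the authors' setup.
import numpy as np

rng = np.random.default_rng(0)

# Two-class HDLSS toy data: dimension d much larger than sample size n.
d, n_per_class = 2000, 20
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(n_per_class, d)),  # class 1
    rng.normal(loc=1.0, scale=1.0, size=(n_per_class, d)),  # class 2 (mean shift)
])
labels = np.repeat([0, 1], n_per_class)
n = X.shape[0]

# Gaussian (RBF) kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2).
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
# Placeholder heuristic for gamma (median pairwise squared distance),
# NOT the data-driven choice studied in the paper.
gamma = 1.0 / np.median(sq_dists[np.triu_indices(n, k=1)])
K = np.exp(-gamma * sq_dists)

# Center the kernel matrix in feature space.
J = np.eye(n) - np.ones((n, n)) / n
Kc = J @ K @ J

# First kernel PC scores: sqrt(lambda_1) * v_1 (largest eigenpair of Kc).
eigvals, eigvecs = np.linalg.eigh(Kc)
scores = np.sqrt(max(eigvals[-1], 0.0)) * eigvecs[:, -1]

# Cluster by the sign of the first PC scores and compare with the truth
# (up to label switching, since cluster labels are arbitrary).
clusters = (scores > 0).astype(int)
agreement = max(np.mean(clusters == labels), np.mean(clusters != labels))
print(f"gamma = {gamma:.3g}, clustering agreement = {agreement:.2f}")
```

On such well-separated toy data the sign of the first PC score typically recovers the two classes; the paper's contribution lies in the asymptotic theory behind this behaviour in the HDLSS regime and in the data-driven choice of \(\gamma\).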
HDLSS
nonlinear PCA
PC score
radial basis function kernel
spherical data