Permutation methods for factor analysis and PCA
From MaRDI portal
Abstract: Researchers often have datasets measuring features of samples, such as test scores of students. In factor analysis and PCA, these features are thought to be influenced by unobserved factors, such as skills. Can we determine how many components affect the data? This is an important problem, because it has a large impact on all downstream data analyses. Consequently, many approaches have been developed to address it. Parallel analysis is a popular permutation method: it randomly permutes each feature of the data, and selects components whose singular values exceed those of the permuted data. Despite its widespread use in leading textbooks and scientific publications, as well as empirical evidence for its accuracy, it has so far lacked a theoretical justification. In this paper, we show that the parallel analysis permutation method consistently selects the large components in certain high-dimensional factor models, but does not select the smaller components. The intuition is that permutations leave the noise invariant while "destroying" the low-rank signal. This provides a justification for permutation methods in PCA and factor models under some conditions. Our work also uncovers drawbacks of permutation methods, and paves the way to improvements.
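The permutation procedure described in the abstract can be sketched in a few lines of numpy. This is an illustrative implementation, not the paper's code: the permutation count, the quantile threshold, and the sequential stopping rule are common choices in parallel analysis software, assumed here for concreteness.

```python
import numpy as np

def parallel_analysis(X, n_perm=100, quantile=0.95, seed=0):
    """Sketch of Horn's parallel analysis: select components whose
    singular values exceed those of column-permuted data."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    sv = np.linalg.svd(X, compute_uv=False)  # descending singular values
    perm_sv = np.empty((n_perm, min(n, p)))
    for b in range(n_perm):
        # Permute each feature (column) independently: this preserves the
        # marginal noise distribution but destroys the low-rank signal.
        Xp = np.column_stack([rng.permutation(X[:, j]) for j in range(p)])
        perm_sv[b] = np.linalg.svd(Xp, compute_uv=False)
    thresh = np.quantile(perm_sv, quantile, axis=0)
    # Count the leading singular values above the permutation threshold.
    k = 0
    while k < len(sv) and sv[k] > thresh[k]:
        k += 1
    return k
```

On data with one strong spike plus i.i.d. noise, this sketch typically returns 1; as the paper shows, weaker spikes below the permutation threshold are missed.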
Recommendations
- A new approach for selecting the number of factors
- Deterministic parallel analysis: an improved method for selecting factors and principal components
- Statistical significance of the contribution of variables to the PCA solution: an alternative permutation strategy
- On the number of principal components: a test of dimensionality based on measurements of similarity between matrices
- Stability approach to selecting the number of principal components
Cites work
- scientific article; zbMATH DE number 3136275
- scientific article; zbMATH DE number 3045859
- A general framework for multiple testing dependence
- A rationale and test for the number of factors in factor analysis
- Asymptotic power of sphericity tests for high-dimensional data
- Asymptotics of sample eigenstructure for a large dimensional spiked covariance model
- Asymptotics of the principal components estimator of large factor models with weakly influential factors
- Considering Horn's parallel analysis from a random matrix theory point of view
- Determining the number of components from the matrix of partial correlations
- Deterministic parallel analysis: an improved method for selecting factors and principal components
- Eigenvalue significance testing for genetic association
- Estimation of spiked eigenvalues in spiked models
- Finite sample approximation results for principal component analysis: A matrix perturbation approach
- High-dimensional asymptotics of prediction: ridge regression and classification
- How many principal components? Stopping rules for determining the number of non-trivial axes revisited
- On the distribution of the largest eigenvalue in principal components analysis
- OptShrink: An Algorithm for Improved Low-Rank Signal Matrix Denoising by Optimal, Data-Driven Singular Value Shrinkage
- Optimal prediction in the linearly transformed spiked model
- Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices
- Principal component analysis.
- Random matrix theory in statistics: a review
- Simultaneous dimension reduction and adjustment for confounding variation
- Spectral analysis of large dimensional random matrices
- Testing hypotheses about the number of factors in large factor models
- The singular values and vectors of low rank perturbations of large rectangular random matrices
Cited in (15)
- Matrix denoising for weighted loss functions and heterogeneous signals
- Rapid evaluation of the spectral signal detection threshold and Stieltjes transform
- Deterministic parallel analysis: an improved method for selecting factors and principal components
- Permutation Statistical Methods with R
- Biwhitening Reveals the Rank of a Count Matrix
- A CLT for the LSS of large-dimensional sample covariance matrices with diverging spikes
- Statistical significance of the contribution of variables to the PCA solution: an alternative permutation strategy
- Statistical inference for principal components of spiked covariance matrices
- Sampling without replacement from a high-dimensional finite population
- The limiting spectral distribution of large random permutation matrices
- Estimating change-point latent factor models for high-dimensional time series
- A note on the likelihood ratio test in high-dimensional exploratory factor analysis
- Estimating Number of Factors by Adjusted Eigenvalues Thresholding
- Consistency of invariance-based randomization tests
- Considering Horn's parallel analysis from a random matrix theory point of view
This page was built for publication: Permutation methods for factor analysis and PCA