Publication:159675
From MaRDI portal
Latest revision as of 11:07, 29 April 2024
DOI: 10.48550/ARXIV.0908.2062, zbMath: 1166.62047, arXiv: 0908.2062, OpenAlex: W3104577407, MaRDI QID: Q159675
Patrick O. Perry, Art B. Owen
Publication date: 14 August 2009
Published in: The Annals of Applied Statistics
Abstract: This article presents a form of bi-cross-validation (BCV) for choosing the rank in outer product models, especially the singular value decomposition (SVD) and the nonnegative matrix factorization (NMF). Instead of leaving out a set of rows of the data matrix, we leave out a set of rows and a set of columns, and then predict the left out entries by low rank operations on the retained data. We prove a self-consistency result expressing the prediction error as a residual from a low rank approximation. Random matrix theory and some empirical results suggest that smaller hold-out sets lead to more over-fitting, while larger ones are more prone to under-fitting. In simulated examples we find that a method leaving out half the rows and half the columns performs well.
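The hold-out scheme in the abstract can be sketched for the SVD case as follows. This is a minimal illustration, not the authors' code or the API of any package: it assumes the data matrix is partitioned into a held-out block A, its companions B (held-out rows, retained columns) and C (retained rows, held-out columns), and the fully retained block D, and that A is predicted by B D_k^+ C, where D_k^+ is the pseudo-inverse of the rank-k truncated SVD of D. The function names, the random fold assignment, and the 2 x 2 default (half the rows and half the columns left out) are choices made for this example.

```python
import numpy as np


def bcv_block_error(X, rank, held_rows, held_cols):
    """Squared error from predicting one held-out block using retained data.

    held_rows and held_cols are boolean masks marking left-out rows/columns.
    (Hypothetical helper written for this illustration.)
    """
    A = X[np.ix_(held_rows, held_cols)]      # held-out block to predict
    B = X[np.ix_(held_rows, ~held_cols)]     # held-out rows, retained columns
    C = X[np.ix_(~held_rows, held_cols)]     # retained rows, held-out columns
    D = X[np.ix_(~held_rows, ~held_cols)]    # fully retained block

    if rank == 0:
        A_hat = np.zeros_like(A)             # rank-0 model predicts zeros
    else:
        U, s, Vt = np.linalg.svd(D, full_matrices=False)
        # Pseudo-inverse of the rank-k truncation of D: V_k diag(1/s_k) U_k^T
        D_k_pinv = (Vt[:rank].T / s[:rank]) @ U[:, :rank].T
        A_hat = B @ D_k_pinv @ C
    return float(np.sum((A - A_hat) ** 2))


def bcv_errors(X, max_rank, n_row_folds=2, n_col_folds=2, seed=0):
    """Total held-out squared error for each candidate rank 0..max_rank.

    Rows and columns are split at random into folds; every (row fold,
    column fold) pair is held out once, so each entry is predicted once.
    max_rank should not exceed the smaller dimension of a retained block.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    row_fold = rng.permutation(X.shape[0]) % n_row_folds
    col_fold = rng.permutation(X.shape[1]) % n_col_folds

    errors = np.zeros(max_rank + 1)
    for i in range(n_row_folds):
        for j in range(n_col_folds):
            held_rows = row_fold == i
            held_cols = col_fold == j
            for k in range(max_rank + 1):
                errors[k] += bcv_block_error(X, k, held_rows, held_cols)
    return errors


# Small demonstration on simulated data: a rank-2 signal plus noise.
demo_rng = np.random.default_rng(0)
signal = demo_rng.standard_normal((40, 2)) @ demo_rng.standard_normal((2, 30))
noisy = signal + 0.1 * demo_rng.standard_normal(signal.shape)
demo_errors = bcv_errors(noisy, max_rank=5)
print("BCV error by rank:", np.round(demo_errors, 2))
print("rank with smallest BCV error:", int(np.argmin(demo_errors)))
```

Choosing the rank with the smallest total held-out error is one natural selection rule under these assumptions. The NMF variant discussed in the abstract would replace the truncated SVD of D with a nonnegative factorization and is not covered by this sketch.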
Full work available at URL: https://arxiv.org/abs/0908.2062
- Multivariate distribution of statistics (62H10)
- Factor analysis and principal components; correspondence analysis (62H25)
- Asymptotic distribution theory in statistics (62E20)
- Random matrices (algebraic aspects) (15B52)
Cites Work
- The truncated SVD as a method for regularization
- Estimating the dimension of a model
- A note on universality of the distribution of the largest eigenvalues in certain sample covariance matrices
- Resampling and exchangeable arrays
- On mixed-type reverse-order laws for the Moore-Penrose inverse of a matrix product
- On the distribution of the largest eigenvalue in principal components analysis
- Principal component analysis.
- Generalized inverses. Theory and applications.
- The pigeonhole bootstrap
- Eigenvalues of large sample covariance matrices of spiked population models
- THE NATURE OF POWER CORRECTIONS IN LARGE-β0 APPROXIMATION
- Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models
- Model Averaging and Dimension Selection for the Singular Value Decomposition
- Inferential Theory for Factor Models of Large Dimensions
- Determining the Number of Factors in Approximate Factor Models
- A new look at the statistical model identification
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
Related Items (32)
- Estimating and Accounting for Unobserved Covariates in High-Dimensional Correlated Data
- Supervised singular value decomposition and its asymptotic properties
- Biwhitening Reveals the Rank of a Count Matrix
- Adaptive singular value shrinkage estimate for low rank tensor denoising
- The cluster graphical Lasso for improved estimation of Gaussian graphical models
- Adaptive shrinkage of singular values
- Double-Matched Matrix Decomposition for Multi-View Data
- Unnamed Item
- ScreeNOT: exact MSE-optimal singular value thresholding in correlated noise
- Multiple hypothesis testing adjusted for latent variables, with an application to the AGEMAP gene expression data
- Deviance matrix factorization
- AUTOMATIC RELEVANCE DETERMINATION IN NONNEGATIVE MATRIX FACTORIZATION BASED ON A ZERO-INFLATED COMPOUND POISSON-GAMMA DISTRIBUTION
- Fast Estimation of Approximate Matrix Ranks Using Spectral Densities
- Rank Selection in Nonnegative Matrix Factorization using Minimum Description Length
- Unsupervised dimensionality reduction versus supervised regularization for classification from sparse data
- The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments
- Cross-Validation With Confidence
- Unifying and Generalizing Methods for Removing Unwanted Variation Based on Negative Controls
- Biclustering with heterogeneous variance
- A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis
- Bi-cross-validation for factor analysis
- Hierarchical nuclear norm penalization for multi-view data integration
- Bayesian simultaneous factorization and prediction using multi-omic data
- A zero-inflated non-negative matrix factorization for the deconvolution of mixed signals of biological data
- Robust singular value decomposition with application to video surveillance background modelling
- Network Cross-Validation for Determining the Number of Communities in Network Data
- A Generalized Least-Square Matrix Decomposition
- Estimating the Number of Clusters Using Cross-Validation
- bcv
- To Wait or Not to Wait: Two-Way Functional Hazards Model for Understanding Waiting in Call Centers
- Unnamed Item
- Analysis of multiview legislative networks with structured matrix factorization: does Twitter influence translate to the real world?
This page was built for publication: Bi-cross-validation of the SVD and the nonnegative matrix factorization