Publication:159675

From MaRDI portal

Latest revision as of 11:07, 29 April 2024

DOI: 10.48550/ARXIV.0908.2062
zbMath: 1166.62047
arXiv: 0908.2062
OpenAlex: W3104577407
MaRDI QID: Q159675

Patrick O. Perry, Art B. Owen

Publication date: 14 August 2009

Published in: The Annals of Applied Statistics

Abstract: This article presents a form of bi-cross-validation (BCV) for choosing the rank in outer product models, especially the singular value decomposition (SVD) and the nonnegative matrix factorization (NMF). Instead of leaving out a set of rows of the data matrix, we leave out a set of rows and a set of columns, and then predict the left out entries by low rank operations on the retained data. We prove a self-consistency result expressing the prediction error as a residual from a low rank approximation. Random matrix theory and some empirical results suggest that smaller hold-out sets lead to more over-fitting, while larger ones are more prone to under-fitting. In simulated examples we find that a method leaving out half the rows and half the columns performs well.
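The hold-out scheme described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The abstract only says the held-out entries are predicted "by low rank operations on the retained data"; the sketch assumes one specific form of that prediction for the SVD case: with the matrix partitioned into a held-out block A, the blocks B and C that share its rows and its columns respectively, and the fully retained block D, A is estimated as B D_k^+ C, where D_k^+ is the pseudo-inverse of the rank-k truncated SVD of D. The function names (truncated_pinv, bcv_error, half_split_masks) are hypothetical.

```python
import numpy as np


def truncated_pinv(D, k):
    """Pseudo-inverse of the best rank-k approximation of D, via the SVD."""
    if k == 0:
        return np.zeros((D.shape[1], D.shape[0]))
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # V_k diag(1 / s_k) U_k^T; assumes the k leading singular values are nonzero.
    return (Vt[:k].T / s[:k]) @ U[:, :k].T


def bcv_error(X, held_rows, held_cols, k):
    """Squared error of predicting the held-out block at rank k.

    held_rows and held_cols are boolean NumPy masks marking the rows and
    columns that are left out.
    """
    A = X[np.ix_(held_rows, held_cols)]    # held-out rows and columns
    B = X[np.ix_(held_rows, ~held_cols)]   # held-out rows, retained columns
    C = X[np.ix_(~held_rows, held_cols)]   # retained rows, held-out columns
    D = X[np.ix_(~held_rows, ~held_cols)]  # retained rows and columns
    A_hat = B @ truncated_pinv(D, k) @ C   # assumed form of the prediction
    return float(np.sum((A - A_hat) ** 2))


def half_split_masks(shape, rng):
    """Hold out a random half of the rows and a random half of the columns."""
    m, n = shape
    rows = np.zeros(m, dtype=bool)
    cols = np.zeros(n, dtype=bool)
    rows[rng.permutation(m)[: m // 2]] = True
    cols[rng.permutation(n)[: n // 2]] = True
    return rows, cols


# Toy usage: an exactly rank-2 matrix, compared across candidate ranks.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 6))
rows, cols = half_split_masks(X.shape, rng)
errors = {k: bcv_error(X, rows, cols, k) for k in range(3)}
best_rank = min(errors, key=errors.get)
```

In practice one would presumably average the error over several hold-out splits before picking the rank; the single half-and-half split here only mirrors the setting the abstract reports as performing well in simulations.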


Full work available at URL: https://arxiv.org/abs/0908.2062







Related Items (32)

Estimating and Accounting for Unobserved Covariates in High-Dimensional Correlated Data
Supervised singular value decomposition and its asymptotic properties
Biwhitening Reveals the Rank of a Count Matrix
Adaptive singular value shrinkage estimate for low rank tensor denoising
The cluster graphical Lasso for improved estimation of Gaussian graphical models
Adaptive shrinkage of singular values
Double-Matched Matrix Decomposition for Multi-View Data
Unnamed Item
ScreeNOT: exact MSE-optimal singular value thresholding in correlated noise
Multiple hypothesis testing adjusted for latent variables, with an application to the AGEMAP gene expression data
Deviance matrix factorization
AUTOMATIC RELEVANCE DETERMINATION IN NONNEGATIVE MATRIX FACTORIZATION BASED ON A ZERO-INFLATED COMPOUND POISSON-GAMMA DISTRIBUTION
Fast Estimation of Approximate Matrix Ranks Using Spectral Densities
Rank Selection in Nonnegative Matrix Factorization using Minimum Description Length
Unsupervised dimensionality reduction versus supervised regularization for classification from sparse data
The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments
Cross-Validation With Confidence
Unifying and Generalizing Methods for Removing Unwanted Variation Based on Negative Controls
Biclustering with heterogeneous variance
A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis
Bi-cross-validation for factor analysis
Hierarchical nuclear norm penalization for multi-view data integration
Bayesian simultaneous factorization and prediction using multi-omic data
A zero-inflated non-negative matrix factorization for the deconvolution of mixed signals of biological data
Robust singular value decomposition with application to video surveillance background modelling
Network Cross-Validation for Determining the Number of Communities in Network Data
A Generalized Least-Square Matrix Decomposition
Estimating the Number of Clusters Using Cross-Validation
bcv
To Wait or Not to Wait: Two-Way Functional Hazards Model for Understanding Waiting in Call Centers
Unnamed Item
Analysis of multiview legislative networks with structured matrix factorization: does Twitter influence translate to the real world?





This page was built for publication: Bi-cross-validation of the SVD and the nonnegative matrix factorization