Optimal Bipartite Network Clustering
From MaRDI portal
Publication:4969080
zbMATH Open1498.68281arXiv1803.06031MaRDI QIDQ4969080FDOQ4969080
Publication date: 5 October 2020
Abstract: We study bipartite community detection in networks, or more generally the network biclustering problem. We present a fast two-stage procedure based on spectral initialization followed by the application of a pseudo-likelihood classifier twice. Under mild regularity conditions, we establish the weak consistency of the procedure (i.e., the convergence of the misclassification rate to zero) under a general bipartite stochastic block model. We show that the procedure is optimal in the sense that it achieves the optimal convergence rate that is achievable by a biclustering oracle, adaptively over the whole class, up to constants. This is further formalized by deriving a minimax lower bound over a class of biclustering problems. The optimal rate we obtain sharpens some of the existing results and generalizes others to a wide regime of average degree growth, from sparse networks with average degrees growing arbitrarily slowly to fairly dense networks with average degrees of order . As a special case, we recover the known exact recovery threshold in the regime of sparsity. To obtain the consistency result, as part of the provable version of the algorithm, we introduce a sub-block partitioning scheme that is also computationally attractive, allowing for distributed implementation of the algorithm without sacrificing optimality. The provable algorithm is derived from a general class of pseudo-likelihood biclustering algorithms that employ simple EM type updates. We show the effectiveness of this general class by numerical simulations.
Full work available at URL: https://arxiv.org/abs/1803.06031
stochastic block modelpseudo-likelihoodcommunity detectionnetwork analysisbiclusteringspectral clusteringbipartite networks
Classification and discrimination; cluster analysis (statistical aspects) (62H30) Learning and adaptive systems in artificial intelligence (68T05) Small world graphs, complex networks (graph-theoretic aspects) (05C82)
Cites Work
- Title not available (Why is that?)
- Title not available (Why is that?)
- A nonparametric view of network models and Newman–Girvan and other modularities
- Estimation and Prediction for Stochastic Blockstructures
- Elements of Information Theory
- Spectral clustering and the high-dimensional stochastic blockmodel
- Pseudo-likelihood methods for community detection in large sparse networks
- Consistency of community detection in networks under degree-corrected stochastic block models
- Consistency of maximum-likelihood and variational estimators in the stochastic block model
- Consistency of spectral clustering in stochastic block models
- Spectral redemption in clustering sparse networks
- Minimax rates of community detection in stochastic block models
- Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels
- Community Detection and Stochastic Block Models
- A Simple SVD Algorithm for Finding Hidden Partitions
- Community detection thresholds and the weak Ramanujan property
- Achieving Optimal Misclassification Proportion in Stochastic Block Model
- Spectral clustering in the dynamic stochastic block model
- The Poisson Approximation to the Poisson Binomial Distribution
- On semidefinite relaxations for the block model
- Community detection in degree-corrected block models
- Exact Recovery in the Stochastic Block Model
- Consistent Adjacency-Spectral Partitioning for the Stochastic Block Model When the Model Parameters Are Unknown
- Community detection in sparse networks via Grothendieck's inequality
- Consistency thresholds for the planted bisection model
- Achieving Exact Cluster Recovery Threshold via Semidefinite Programming: Extensions
- A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the sum of Observations
- The tail of the hypergeometric distribution
- Asymptotic error probability of binary hypothesis testing for Poisson point-process observations (Corresp.)
- Random Laplacian matrices and convex relaxations
- Achieving Exact Cluster Recovery Threshold via Semidefinite Programming
- Phase transitions in semidefinite relaxations
- Semidefinite programs on sparse random graphs and their application to community detection
- A spectral method for community detection in moderately sparse degree-corrected stochastic block models
- Optimal estimation and completion of matrices with biclustering structures
- A simple proof of Stirling's formula for the gamma function
- Analysis of spectral clustering algorithms for community detection: the general bipartite setting
- Optimal rates for community estimation in the weighted stochastic block model
Cited In (7)
- A Time-Varying Network for Cryptocurrencies
- Identifiability and parameter estimation of the overlapped stochastic co-block model
- Subspace estimation from unbalanced and incomplete data matrices: \({\ell_{2,\infty}}\) statistical guarantees
- Rate optimal Chernoff bound and application to community detection in the stochastic block models
- Adjusted chi-square test for degree-corrected block models
- Heteroskedastic PCA: algorithm, optimality, and applications
- Analysis of spectral clustering algorithms for community detection: the general bipartite setting
This page was built for publication: Optimal Bipartite Network Clustering
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q4969080)