Optimal bipartite network clustering
From MaRDI portal
Publication:4969080
Abstract: We study bipartite community detection in networks, or more generally the network biclustering problem. We present a fast two-stage procedure based on spectral initialization followed by the application of a pseudo-likelihood classifier twice. Under mild regularity conditions, we establish the weak consistency of the procedure (i.e., the convergence of the misclassification rate to zero) under a general bipartite stochastic block model. We show that the procedure is optimal in the sense that it achieves the optimal convergence rate that is achievable by a biclustering oracle, adaptively over the whole class, up to constants. This is further formalized by deriving a minimax lower bound over a class of biclustering problems. The optimal rate we obtain sharpens some of the existing results and generalizes others to a wide regime of average degree growth, from sparse networks with average degrees growing arbitrarily slowly to fairly dense networks with average degrees of order . As a special case, we recover the known exact recovery threshold in the regime of sparsity. To obtain the consistency result, as part of the provable version of the algorithm, we introduce a sub-block partitioning scheme that is also computationally attractive, allowing for distributed implementation of the algorithm without sacrificing optimality. The provable algorithm is derived from a general class of pseudo-likelihood biclustering algorithms that employ simple EM type updates. We show the effectiveness of this general class by numerical simulations.
Recommendations
- Analysis of spectral clustering algorithms for community detection: the general bipartite setting
- Bipartite communities via spectral partitioning
- Matched bipartite block model with covariates
- Bipartite communities via spectral partitioning
- Achieving optimal misclassification proportion in stochastic block models
Cites work
- scientific article; zbMATH DE number 6118218 (Why is no real title available?)
- A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the sum of Observations
- A Simple SVD Algorithm for Finding Hidden Partitions
- A nonparametric view of network models and Newman–Girvan and other modularities
- A simple proof of Stirling's formula for the gamma function
- A spectral method for community detection in moderately sparse degree-corrected stochastic block models
- Achieving Exact Cluster Recovery Threshold via Semidefinite Programming
- Achieving Exact Cluster Recovery Threshold via Semidefinite Programming: Extensions
- Achieving optimal misclassification proportion in stochastic block models
- Analysis of spectral clustering algorithms for community detection: the general bipartite setting
- Asymptotic error probability of binary hypothesis testing for Poisson point-process observations (Corresp.)
- Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels
- Community detection and stochastic block models: recent developments
- Community detection in degree-corrected block models
- Community detection in sparse networks via Grothendieck's inequality
- Community detection thresholds and the weak Ramanujan property
- Consistency of community detection in networks under degree-corrected stochastic block models
- Consistency of maximum-likelihood and variational estimators in the stochastic block model
- Consistency of spectral clustering in stochastic block models
- Consistent adjacency-spectral partitioning for the stochastic block model when the model parameters are unknown
- Elements of Information Theory
- Estimation and Prediction for Stochastic Blockstructures
- Exact Recovery in the Stochastic Block Model
- Matched bipartite block model with covariates
- Minimax rates of community detection in stochastic block models
- On semidefinite relaxations for the block model
- Optimal estimation and completion of matrices with biclustering structures
- Optimal rates for community estimation in the weighted stochastic block model
- Phase transitions in semidefinite relaxations
- Pseudo-likelihood methods for community detection in large sparse networks
- Random Laplacian matrices and convex relaxations
- Semidefinite programs on sparse random graphs and their application to community detection
- Spectral clustering and the high-dimensional stochastic blockmodel
- Spectral clustering in the dynamic stochastic block model
- Spectral redemption in clustering sparse networks
- The Poisson Approximation to the Poisson Binomial Distribution
- The tail of the hypergeometric distribution
Cited in
(8)- Matched bipartite block model with covariates
- A Time-Varying Network for Cryptocurrencies
- Identifiability and parameter estimation of the overlapped stochastic co-block model
- Subspace estimation from unbalanced and incomplete data matrices: \({\ell_{2,\infty}}\) statistical guarantees
- Adjusted chi-square test for degree-corrected block models
- Rate optimal Chernoff bound and application to community detection in the stochastic block models
- Heteroskedastic PCA: algorithm, optimality, and applications
- Analysis of spectral clustering algorithms for community detection: the general bipartite setting
This page was built for publication: Optimal bipartite network clustering
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q4969080)