A distributed community detection algorithm for large scale networks under stochastic block models
From MaRDI portal
Publication:6096648
Abstract: With rapid developments of information and technology, large scale network data are ubiquitous. In this work we develop a distributed spectral clustering algorithm for community detection in large scale networks. To handle the problem, we distribute l pilot network nodes on the master server and the others on worker servers. A spectral clustering algorithm is first conducted on the master to select pseudo centers. The indexes of the pseudo centers are then broadcasted to workers to complete distributed community detection task using a SVD type algorithm. The proposed distributed algorithm has three merits. First, the communication cost is low since only the indexes of pseudo centers are communicated. Second, no further iteration algorithm is needed on workers and hence it does not suffer from problems as initialization and non-robustness. Third, both the computational complexity and the storage requirements are much lower compared to using the whole adjacency matrix. A Python package DCD (www.github.com/Ikerlz/dcd) is developed to implement the distributed algorithm for a Spark system. Theoretical properties are provided with respect to the estimation accuracy and mis-clustering rates. Lastly, the advantages of the proposed methodology are illustrated by experiments on a variety of synthetic and empirical datasets.
Recommendations
- A spectral method for community detection in moderately sparse degree-corrected stochastic block models
- Subsampling spectral clustering for stochastic block models in large-scale networks
- Pseudo-likelihood methods for community detection in large sparse networks
- scientific article; zbMATH DE number 7307464
- Find Your Place: Simple Distributed Algorithms for Community Detection
Cites work
- A nonparametric view of network models and Newman–Girvan and other modularities
- A tensor approach to learning mixed membership community models
- Co-clustering directed graphs to discover asymmetries and directional communities
- Communication-efficient algorithms for statistical optimization
- Community detection and stochastic block models: recent developments
- Consistency of community detection in networks under degree-corrected stochastic block models
- Consistency of spectral clustering in stochastic block models
- Distributed estimation of principal eigenspaces
- Distributed one-step upgraded estimation for non-uniformly and non-randomly distributed data
- Distributed semi-supervised learning with kernel ridge regression
- Distributed testing and estimation under sparse high dimensional models
- Divide and conquer local average regression
- Entrywise eigenvector analysis of random matrices with low expected rank
- Fast community detection by SCORE
- Multivariate spatial autoregressive model for large scale social networks
- Numerical methods for large eigenvalue problems
- Peer effects in bedtime decisions among adolescents: a social network model with sampled data
- Pseudo-likelihood methods for community detection in large sparse networks
- Role of normalization in spectral clustering for stochastic blockmodels
- Scalable and Robust Community Detection with Randomized Sketching
- Spectral clustering and the high-dimensional stochastic blockmodel
- TENET: tail-event driven network risk
Cited in
(4)
This page was built for publication: A distributed community detection algorithm for large scale networks under stochastic block models
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6096648)