Model-based clustering of multiple networks with a hierarchical algorithm

DOI: 10.48550/arXiv.2211.02314; zbMath: 1529.62033; DBLP: journals/sac/Rebafka24; arXiv: 2211.02314; OpenAlex: W4388464963; Wikidata: Q131285860; Scholia: Q131285860; MaRDI QID: Q57414

Tabea Rebafka

Publication date: 4 November 2022

Published in: Statistics and Computing

Full work available at URL: https://arxiv.org/abs/2211.02314

zbMATH Keywords

stochastic block model; graph clustering; integrated classification likelihood; multiple networks; agglomerative algorithm; graphon distance

Mathematics Subject Classification ID

Computational methods for problems pertaining to statistics (62-08)
Classification and discrimination; cluster analysis (statistical aspects) (62H30)

Related Items (1)

graphclust

Summary: This paper introduces a hierarchical algorithm for clustering multiple networks, even when these networks vary in size and do not share the same vertices. The method uses a statistical model-based approach, leveraging stochastic block models (SBMs) to group networks with similar topological structures. Clustering is achieved by maximizing the integrated classification likelihood (ICL) criterion, with an automated selection of the optimal number of clusters. A novel technique is presented to address label-switching issues in SBMs by comparing graphons, enabling accurate aggregation of clusters. The method is evaluated on synthetic data and applied to ecological food web networks, demonstrating its efficiency, interpretability, and robustness compared to existing graph clustering approaches.
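
The graphon comparison mentioned in the summary can be illustrated with a minimal sketch. It assumes each SBM is described by block proportions `pi` and a connectivity matrix `gamma`, treats the SBM graphon as a step function on the unit square, and takes the smallest L2 distance over all relabelings of one model's blocks. The function names, the choice of the L2 norm, and the brute-force search over permutations are assumptions made for illustration; the paper's exact definition and matching procedure may differ.

```python
from itertools import permutations

import numpy as np


def step_graphon_l2(pi_a, gamma_a, pi_b, gamma_b):
    """L2 distance between two SBM step-function graphons on [0, 1]^2."""
    cuts_a, cuts_b = np.cumsum(pi_a), np.cumsum(pi_b)
    # Common refinement of the two block partitions of [0, 1].
    edges = np.unique(np.clip(np.concatenate(([0.0], cuts_a, cuts_b)), 0.0, 1.0))
    widths = np.diff(edges)
    mids = (edges[:-1] + edges[1:]) / 2
    # Block index of each refined interval under either model.
    ia = np.minimum(np.searchsorted(cuts_a, mids, side="right"), len(cuts_a) - 1)
    ib = np.minimum(np.searchsorted(cuts_b, mids, side="right"), len(cuts_b) - 1)
    diff = gamma_a[np.ix_(ia, ia)] - gamma_b[np.ix_(ib, ib)]
    # Each refined cell contributes (difference)^2 times its area.
    return float(np.sqrt(np.sum(diff**2 * np.outer(widths, widths))))


def graphon_distance(pi_a, gamma_a, pi_b, gamma_b):
    """Smallest step-graphon distance over relabelings of model b (illustrative)."""
    pi_a, gamma_a = np.asarray(pi_a, float), np.asarray(gamma_a, float)
    pi_b, gamma_b = np.asarray(pi_b, float), np.asarray(gamma_b, float)
    best_dist, best_perm = np.inf, None
    # Brute force over block permutations: only sensible for few blocks.
    for perm in permutations(range(len(pi_b))):
        p = list(perm)
        d = step_graphon_l2(pi_a, gamma_a, pi_b[p], gamma_b[np.ix_(p, p)])
        if d < best_dist:
            best_dist, best_perm = d, perm
    return best_dist, best_perm


# Same model written with its two block labels swapped.
dist, perm = graphon_distance(
    [0.3, 0.7], [[0.8, 0.1], [0.1, 0.4]],
    [0.7, 0.3], [[0.4, 0.1], [0.1, 0.8]],
)
print(dist, perm)
```

Because the distance is computed on graphons and not on labeled parameters, two models with different numbers of blocks can also be compared in this sketch, and the minimizing permutation indicates how the block labels of the second model line up with those of the first.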

Summary_simple: This paper explains a way to group networks, like maps of connections between people or animals, based on how similar their structures are. It uses a statistical tool called the stochastic block model (SBM) to find these groups automatically. The process builds a tree-like diagram (dendrogram) that shows how the networks are merged into groups, and it picks the best number of groups without guessing. A special trick compares the fitted models so that the grouping stays accurate even if the networks are labeled differently. The method was tested on simulated data and on real examples, such as food webs in nature, and worked better than older techniques.
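
The tree-building process described above can be sketched as a greedy agglomerative loop: start with every network in its own group, repeatedly apply the merge that most increases a criterion, and stop when no merge increases it. The sketch below is not the paper's implementation. The function `toy_score` is a hypothetical stand-in for the integrated classification likelihood (a penalized within-group sum of squares on one number per network), and the stopping rule is an assumption.

```python
from itertools import combinations


def greedy_agglomerate(n_items, score):
    """Greedily merge clusters while the criterion `score` keeps increasing."""
    clusters = [frozenset([i]) for i in range(n_items)]
    history = []  # (cluster_a, cluster_b, score after merge): enough for a dendrogram
    current = score(clusters)
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            candidate = [c for c in clusters if c not in (a, b)] + [a | b]
            s = score(candidate)
            if best is None or s > best[0]:
                best = (s, a, b, candidate)
        if best[0] <= current:
            break  # no merge improves the criterion: keep this number of clusters
        current, a, b, clusters = best
        history.append((a, b, current))
    return clusters, history


# Hypothetical data: one summary value per network (e.g. an edge density).
densities = [0.10, 0.12, 0.11, 0.50, 0.52]


def toy_score(clusters, penalty=0.01):
    """Stand-in criterion (NOT the ICL): fit term minus a per-cluster penalty."""
    within = 0.0
    for c in clusters:
        values = [densities[i] for i in c]
        mean = sum(values) / len(values)
        within += sum((v - mean) ** 2 for v in values)
    return -within - penalty * len(clusters)


clusters, history = greedy_agglomerate(len(densities), toy_score)
print(sorted(sorted(c) for c in clusters))
```

The recorded merge history gives the nested sequence of clusterings that a dendrogram would display, and the number of clusters falls out of the stopping rule instead of being fixed in advance.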