Real world networks for network classification method evaluation

From MaRDI portal
Dataset:6701279



DOI: 10.5281/zenodo.7749514, Zenodo: 7749514, MaRDI QID: Q6701279, FDO: Q6701279

Dataset published at Zenodo repository.

Lucas C. Ribas, Odemir Martinez Bruno, Kallil Zielinski, Jeaneth Machicao

Publication date: 18 March 2023

Copyright license: Creative Commons Attribution 4.0 International



This research presents a set of network databases based on real-world data, aimed at evaluating the effectiveness of network classification methods. The first database, called the Social database, consists of networks from the Stanford Network Analysis Project (SNAP) platform and includes two classes, Google+ and Twitter. We preprocessed the data and reduced the original SNAP dataset so that each class contains 50 networks. The other databases presented in this study are collectively known as the Metabolic database. They are constructed using the substrate-product network model and are based on biochemical reactions of organisms obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. In this model, metabolites are vertices and chemical reactions are edges. The Metabolic database comprises six classification schemes. The first is the kingdom-database, with 160 network samples in four classes of 40 networks each, representing the animal, plant, fungi, and protist kingdoms. The remaining five are the Animal-database, Fungi-database, Plant-database, Firmicutes-Bacilis-database, and Actinobacteria-database, each containing a varying number of network samples. To ensure consistency and comparability of results, all networks were converted to a common adjacency list format, which also makes the data more efficient to process and analyze.
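As an illustration of the substrate-product model and the adjacency list format described above, the following minimal Python sketch builds a small network from toy reactions. It is not the authors' pipeline: the function names, the reaction entries, the choice of undirected edges, and the one-line-per-vertex text layout are all assumptions made for this example, and the actual file layout in the Zenodo archive may differ.

```python
from collections import defaultdict
from itertools import product


def substrate_product_graph(reactions):
    """Build an adjacency list from (substrates, products) pairs.

    Assumption: every substrate of a reaction is linked to every product
    of that reaction, and edges are treated as undirected.
    """
    adjacency = defaultdict(set)
    for substrates, products in reactions:
        for s, p in product(substrates, products):
            if s != p:  # skip self-loops
                adjacency[s].add(p)
                adjacency[p].add(s)
    # Sort vertices and neighbours so the output is deterministic.
    return {node: sorted(nbrs) for node, nbrs in sorted(adjacency.items())}


def to_adjacency_lines(adjacency):
    """Serialize as one line per vertex: the vertex, then its neighbours."""
    return [" ".join([node, *nbrs]) for node, nbrs in adjacency.items()]


# Toy reactions for illustration only (not taken from the dataset).
reactions = [
    (["glucose", "ATP"], ["G6P", "ADP"]),
    (["G6P"], ["F6P"]),
]

graph = substrate_product_graph(reactions)
lines = to_adjacency_lines(graph)
for line in lines:
    print(line)
```

Each printed line lists a metabolite followed by the metabolites it shares a reaction with, which is one common way to write an adjacency list as plain text.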






