Cross-ecosystem categorization: A manual-curation protocol for the categorization of Java Maven libraries along Python PyPI Topics (dataset)
DOI: 10.5281/zenodo.10480832; Zenodo: 10480832; MaRDI QID: Q6717882; FDO: Q6717882
Dataset published at Zenodo repository.
Carlos E. Budde, Yuan Feng, Ranindya Paramitha, Fabio Massacci
Publication date: 10 January 2024
Copyright license: Creative Commons Attribution 4.0 International
This dataset reports all the information needed to implement a human-guided protocol for the categorisation of libraries, from any software ecosystem, along the 24 top-level PyPI Topic classifiers. It also contains the data produced in a demonstration in which the protocol was applied to 256 open-source Java libraries from Maven Central with high- or critical-severity CVEs. The dataset can be used as ground truth for cross-ecosystem studies in software engineering, especially from functional and security perspectives. It contains: (1) the protocol designed to interpret sources for category assessment and to arbitrate the results; (2) the sources and metadata, including CVEs, collected for the demonstration; (3) the set of categorised libraries and CVE statistics, including a higher-level classification into Local or Remote network functionalities.
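As a rough illustration of how per-category CVE statistics of this kind might be tallied, the following Python sketch aggregates CVE counts by PyPI Topic and by Local/Remote class. The record layout, field names (`library`, `topic`, `network`, `cves`), library coordinates, and all values are hypothetical and are not taken from the dataset files; consult the Zenodo record for the actual file structure.

```python
from collections import Counter

# Hypothetical records: field names and values are illustrative only,
# not taken from the dataset files.
LIBRARIES = [
    {"library": "org.example:http-client", "topic": "Internet",
     "network": "Remote", "cves": 3},
    {"library": "org.example:xml-parser", "topic": "Text Processing",
     "network": "Local", "cves": 1},
    {"library": "org.example:web-framework", "topic": "Internet",
     "network": "Remote", "cves": 2},
    {"library": "org.example:crypto-utils", "topic": "Security",
     "network": "Local", "cves": 4},
]


def cve_totals(records, key):
    """Sum the 'cves' field of each record, grouped by the given key."""
    totals = Counter()
    for record in records:
        totals[record[key]] += record["cves"]
    return dict(totals)


# Totals per (assumed) PyPI Topic and per Local/Remote class.
by_topic = cve_totals(LIBRARIES, "topic")
by_network = cve_totals(LIBRARIES, "network")

print(by_topic)
print(by_network)
```

The same grouping function could be reused for any other categorical column once the real files are loaded, for example with the `csv` module from the standard library.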