Today's cat is tomorrow's dog: accounting for time-based changes in the labels of ML vulnerability detection approaches (Replication Package Part 2: Linux dataset)
DOI: 10.5281/zenodo.14803487
Zenodo: 14803487
MaRDI QID: Q6717902
FDO: Q6717902
Dataset published at Zenodo repository.
Fabio Massacci, Ranindya Paramitha, Yuan Feng, Anonymous
Publication date: 11 February 2025
Copyright license: Creative Commons Attribution 4.0 International
The Replication Package of "Today's cat is tomorrow's dog: accounting for time-based changes in the labels of ML vulnerability detection approaches" Part 2 (LINUX Dataset)

This repository includes:

1. Code to replicate parts of this study:
   a. 1_generate_datasets implements our methodology to generate the datasets.
   b. 2_run_models runs the ML models during the evaluation.
   c. 3_result_replication generates the charts presented in the paper from the ML evaluation results.
2. Datasets, in 2 folders:
   a. original datasets: 1 from NVD Vuldeepecker and 3 extracted from BigVul.
   b. LINUX datasets: train, validation, and test sets for each time of observation, extracted using our methodology from the BigVul dataset for the project linux.
3. Pretrained-models that we generated during our evaluation (3 test results for each time point in the timeline [2011-2019]).
4. Results of our evaluation: the folder ALL contains the overall results, and the other folders contain the results by model.

Please refer to the following repositories for the other datasets and pre-trained models:
- Part 1 NVD Vuldeepecker: https://doi.org/10.5281/zenodo.8207883
- Part 3 OPENSSL: https://doi.org/10.5281/zenodo.10966117
- Part 4 POPPLER: https://doi.org/10.5281/zenodo.14713143