Automated Generation of Code Contracts - Generative AI to the Rescue?

DOI: 10.5281/zenodo.13351004
Zenodo: 13351004
MaRDI QID: Q6706133
FDO: Q6706133

Dataset published at Zenodo repository.

Oscar Nierstrasz, Christos Tsigkanos, Timo Kehrer, Noah Bühlmann, Manuel Ohrndorf, Sandra Greiner

Publication date: 20 August 2024

Copyright license: Creative Commons Attribution 4.0 International

This replication package provides the setup and results for generating OpenJML code contracts for Java source code by fine-tuning the CodeT5 and CodeT5+ transformer models and applying the resulting models. The contract generation setup covers both the training of the AI models and their application. Furthermore, we analyzed the generated annotations with respect to their logical validity and the types of OpenJML compilation errors. Both methods, together with their results, are provided as well.

Source Code Repository (see also scripts-sources.tar):
  https://github.com/SEG-UNIBE/auto-generated-code-contracts
  https://zenodo.org/doi/10.5281/zenodo.13356451

Replication Package: contains the following [folders]

Scripts:
  [scripts-sources.tar]: source code of the following scripts
    - Python scripts that we used for training and for adding the OpenJML code contracts to the Java methods
    - automated analyses of the studied source code classes and of the types of compilation errors

Sourcegraph Search Results:
  [sourcegraph-results.tar]: the results of the Sourcegraph search queries

Datasets:
  [dataset.tar]: the dataset including the weka project, which contributes two-thirds of the contracts
  [dataset-withoutweka.tar]: the dataset without weka, which is significantly smaller and was used to examine the performance bias when training and testing without weka

CodeT5 Models:
  [codet5-contracts.tar]: the best performing CodeT5 model, which was fine-tuned to create OpenJML annotations for methods
  [codet5p-contracts.tar]: the best performing CodeT5+ model, which was fine-tuned to create OpenJML annotations for methods
  [codet5p-contracts-withoutweka.tar]: the CodeT5+ model, which was trained without weka on the same task

Analysis Results:
  [analysis-results.tar/compilability-analysis]: the results of the compilability analysis
    - the subjects to which we applied the best performing CodeT5+ model
    - the compilation results and their analysis
  [analysis-results.tar/logical-analysis]: the results of the logical analysis
    - the analysis of the logical validity of SimpleStack and SimpleTicTacToe
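
For readers unfamiliar with OpenJML, the following minimal sketch illustrates the kind of method-level contract the description refers to. It is not taken from the dataset: the class BoundedStackSketch and all of its members are hypothetical and chosen only for illustration, and the annotations have not been checked against the tooling in this package. JML specifications live in special comments (//@ and /*@ ... @*/), so the class compiles as ordinary Java.

```java
// Hypothetical example, not part of the replication package.
// JML annotations are written in comments and ignored by a plain Java compiler.
class BoundedStackSketch {
    private /*@ spec_public @*/ final int[] items;
    private /*@ spec_public @*/ int count;

    // Class invariant: the element count always stays within the array bounds.
    //@ public invariant 0 <= count && count <= items.length;

    //@ requires capacity > 0;
    //@ ensures count == 0 && items.length == capacity;
    BoundedStackSketch(int capacity) {
        this.items = new int[capacity];
        this.count = 0;
    }

    // Precondition: the stack must not be full.
    // Postcondition: the value is on top and the count grew by one.
    //@ requires count < items.length;
    //@ assignable count, items[*];
    //@ ensures count == \old(count) + 1;
    //@ ensures items[count - 1] == value;
    void push(int value) {
        items[count++] = value;
    }

    // Precondition: the stack must not be empty.
    // Postcondition: the former top element is returned and the count shrank by one.
    //@ requires count > 0;
    //@ assignable count;
    //@ ensures count == \old(count) - 1;
    //@ ensures \result == items[count];
    int pop() {
        return items[--count];
    }

    //@ ensures \result == count;
    /*@ pure @*/ int size() {
        return count;
    }
}
```

The requires clauses state what a caller must guarantee, the ensures clauses state what the method guarantees in return, and assignable restricts which locations a method may modify. A generated contract can fail in two distinct ways that mirror the two analyses in this package: it may not compile under OpenJML, or it may compile but state a condition that does not hold for the method.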
