An Annotated Dataset of Stack Overflow Post Edits




DOI: 10.5281/zenodo.3938946; Zenodo: 3938946; MaRDI QID: Q6710015; FDO: Q6710015

Dataset published at Zenodo repository.

Sebastian Baltes, Markus Wagner

Publication date: 10 July 2020

Copyright license: Creative Commons Attribution-ShareAlike 4.0 International



To improve software engineering, software repositories have been mined for code snippets and bug fixes. Typically, this mining takes place at the level of files or commits. To be able to dig deeper and to extract insights at a higher resolution, we present an annotated dataset that contains over 7 million edits of code and text on Stack Overflow. Our preliminary study indicates that these edits might be a treasure trove for mining information about fine-grained patches, e.g., for the optimisation of non-functional properties.

EDIT: In the more recent version, I fixed GetEditContent.sql, which had an ambiguous column name in one of the select statements.






