SelfCode 2.0: Annotated Corpus of Student Self-Explanations to Introductory JAVA Programs in Computer Science
DOI: 10.5281/zenodo.10912669; Zenodo: 10912669; MaRDI QID: Q6725598; FDO: Q6725598
Dataset published at Zenodo repository.
Jeevan Chapagain, Arun-Balajiee Lekshmi-Narayanan, Vasile Rus, Peter Brusilovsky
Publication date: 18 December 2023
Copyright license: Creative Commons Attribution 4.0 International
Dataset Description:

This dataset was collected during a lab study conducted in Spring 2022 for introductory JAVA programming. In the experimental condition of the study, students had to provide line-by-line explanations of four JAVA programs. The JAVA programs were selected from the examples made available in the PCEX Worked Examples interface. The collected explanations were then split by attempt number. Students could make two attempts based on the feedback provided through the PCEX interface; in their third attempt, they filled in the blanks to complete an explanation of the particular line of code. This dataset contains only the annotated examples of explanations provided by students.

The explanations were annotated for correctness (binary rating, 0 or 1), completeness (binary rating, 0 or 1), and similarity (rating scale from 1 to 5).

Correctness: Given the line of code and the context of the line in the program, whether the student explanation covers only the topics relevant to the line of code.

Completeness: Given the line of code and the context of the line in the program, whether the student explanation covers all the topics relevant to the line of code.

Similarity: Given the line of code, the context of the line in the program, and an expert explanation of the line of code, how similar the student explanation is to the expert explanation, on a rating scale from 1 to 5 defined as follows:

1 - expert and student explanations are very different
2 - expert and student explanations are somewhat alike, but there are major differences in the concepts / topics explained
3 - expert and student explanations are similar, but there are differences in the concepts / topics explained
4 - expert and student explanations are similar and have few differences in the concepts / topics explained
5 - expert and student explanations are very similar

Overall, 3000 single attempts (corresponding to 40 student explanation submissions) were annotated against various expert explanation pairs.
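The annotation scheme described above can be sketched as a small validated record type. This is only an illustration: the class name `Annotation`, its field names, and the `SIMILARITY_SCALE` mapping are hypothetical and do not reflect the actual column names in the released files, which should be checked against the Zenodo record.

```python
from dataclasses import dataclass

# Short labels paraphrasing the 1-5 similarity scale described above.
SIMILARITY_SCALE = {
    1: "very different",
    2: "somewhat alike, major differences",
    3: "similar, with differences",
    4: "similar, few differences",
    5: "very similar",
}


@dataclass(frozen=True)
class Annotation:
    """One annotated student explanation (field names are hypothetical)."""

    correctness: int   # binary rating: 0 or 1
    completeness: int  # binary rating: 0 or 1
    similarity: int    # rating from 1 to 5, see SIMILARITY_SCALE

    def __post_init__(self):
        # Reject values outside the rating ranges given in the description.
        if self.correctness not in (0, 1):
            raise ValueError("correctness must be 0 or 1")
        if self.completeness not in (0, 1):
            raise ValueError("completeness must be 0 or 1")
        if self.similarity not in SIMILARITY_SCALE:
            raise ValueError("similarity must be an integer from 1 to 5")

    def similarity_label(self) -> str:
        """Return the short textual label for the similarity rating."""
        return SIMILARITY_SCALE[self.similarity]
```

For example, `Annotation(correctness=1, completeness=0, similarity=4)` would represent an explanation that is correct but incomplete and close to the expert explanation, while `Annotation(1, 1, 6)` would be rejected.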
Dataset Summary:

| Explanation Type | N | Definition |
|---|---|---|
| Experts | 2 | Source Code Line-by-Line Explanations by Experts |
| Students | 60 (annotated 40) | Source Code Line-by-Line Explanations by Students |

COUNT of std_sent_count, by std_sent_count (columns):

| | 1 | 2 | 3 | 4 | 5 | 6 | Grand Total |
|---|---|---|---|---|---|---|---|
| 1 | 1854 | 367 | 245 | 107 | 34 | 33 | 2640 |
| 2 | 222 | 46 | 40 | 12 | 6 | 6 | 332 |
| 3 | 21 | 5 | 5 | 5 | 1 | 2 | 39 |
| 4 | 2 | 1 | 2 | 3 | | | 8 |
| Grand Total | 2099 | 419 | 292 | 127 | 41 | 41 | 3019 |

Sample Data:

Program: PointTester; Line number: 12; Line code: private int y;

Expert1: Every object of the Point class will have its own y-coordinate. Therefore, we need to declare an instance variable for the class to store the y-coordinate of the point. We declare it as int because we want to have integer coordinates for the point. Note that an instance variable is a variable defined in a class, for which each instantiated object of the class has a separate copy, or instance.

Expert2: The instance variables are declared as private to prevent direct access to them from outside the class. In this way, no unexpected modifications to a Point object's data are possible.

Student1: initialize a private value inside the point class with no value yet
Student2: Declares the private int variable y.
Student3: Creates a private int that can only be accessed by class Point called int y
...
Student59: private variable used to store the value entered into the value of the y coordinate

Kappa Scores:

| Round | Row Numbers | Correctness Rating Agreement (%) | Correctness Rating Kappa | Sufficiency Rating Agreement (%) | Sufficiency Rating Kappa |
|---|---|---|---|---|---|
| 1 | 1000 - 1432 | 92.9 | 0.365 | 0.708 | -0.0123 |
| 2 | 1432 - 1864 | 94.2 | 0.263 | 77.6 | 0.329 |
| 3 | 1864 - 1964 | 75.3 | 0 | 70.3 | 0.299 |
| 4 | 1964 - 2064 | 86 | 0.108 | 74.7 | 0.275 |
| 5 | 2064 - 2264 | 95.5 | -0.0158 | 81.5 | 0.312 |
| 6 | 2264 - 2464 | 83.5 | 0.039 | 86.5 | 0.648 |
| 7 | 2464 - 2864 | 92 | 0.103 | 74.5 | 0.188 |
| 8 | 2864 - 3005 | 86.5 | -0.026 | 72.3 | 0.117 |

Citation Format:

If using this dataset in your project, please cite:

Lekshmi-Narayanan, A.-B., Chapagain, J., Brusilovsky, P., Rus, V. (2023). SelfCode 2.0: Annotated Corpus of Student Self-Explanations to Introductory JAVA Programs in Computer Science [Data set]. Zenodo. https://doi.org/10.5281/zenodo.10912669

Acknowledgements:

This project was funded as a part of NSF Award #1822752.
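As a supplementary note on the Kappa Scores table: the source does not say which kappa statistic was used or how many annotators rated each round. The sketch below assumes Cohen's kappa for two annotators, computed from observed agreement p_o and chance agreement p_e as kappa = (p_o - p_e) / (1 - p_e). The function names and the ratings in the usage note are made up for illustration and are not taken from the dataset.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Percentage of items on which the two annotators gave the same rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two annotators (an assumed choice of statistic)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items with identical ratings.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    labels = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label throughout: kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)
```

With the made-up binary ratings `[1, 1, 1, 1]` and `[1, 1, 1, 0]`, `percent_agreement` gives 75.0 while `cohen_kappa` gives 0.0, because one annotator used a single label throughout and the observed agreement equals the chance agreement. High raw agreement can therefore coexist with a kappa near zero when the ratings are heavily skewed toward one label.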