NoVAGraphS FSA User-Agent Corpus
DOI: 10.5281/zenodo.10822733
Zenodo: 10822733
MaRDI QID: Q6718990
FDO: Q6718990
Dataset published in the Zenodo repository.
Pier Felice Balestrucci, Alessandro Mazzei, Cristian Bernareggi, Elisa di Nuovo, Manuela Sanguinetti, Luca Anselma
Publication date: 15 March 2024
Paper: Di Nuovo E., Sanguinetti M., Balestrucci P.F., Anselma L., Bernareggi C., Mazzei A. (2024), Educational Dialogue Systems for Visually Impaired Students: Introducing a Task-Oriented User-Agent Corpus. Accepted paper at the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024).
Contact person: Elisa Di Nuovo, elisa.dinuovo@gmail.com

Dataset Summary
Collection of user-agent interactions revolving around the description of Finite State Automata.

Dataset Description
The corpus consists of a UTF-8 encoded CSV file with the following columns:
CODE_ID: the id of the interaction
Turn: the turn number within the interaction
Participant: the sender of the utterance (U for the user, S for the agent)
Text: the utterance content
VIP: whether the user is a Visually-Impaired Person
Token count: the number of tokens in the utterance (counted using the spaCy tokenizer)
DAs_GOLD and Errors_GOLD: the labels assigned for Dialog Acts and Errors, respectively
FSA_ID: the id of the Finite State Automaton referred to in the conversation (it corresponds to the names of the PNG and HTML files containing the relevant information on the FSA)

Additional Data
Two PNG files with the graphical representation of the automata
Two HTML files containing the state tables of the automata
RASA configuration files used to train the DIET classifier on the DAs

Access Request
To access the data, users need to fill out the access request Google Form.
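
The CSV columns described above lend themselves to grouping rows by interaction. The following is a minimal sketch using only the Python standard library, not code from the dataset authors. It assumes a comma delimiter, header names spelled exactly as in the column list, and integer values in Turn. The function names `load_interactions` and `user_turns` are hypothetical, and the sample rows are placeholders invented for illustration, not taken from the corpus (the actual encoding of VIP and of the label columns is not specified here).

```python
import csv
import io
from collections import defaultdict

# Column names as listed in the dataset description (exact spelling assumed).
COLUMNS = ["CODE_ID", "Turn", "Participant", "Text", "VIP",
           "Token count", "DAs_GOLD", "Errors_GOLD", "FSA_ID"]


def load_interactions(handle):
    """Group rows by CODE_ID and order each interaction by turn number."""
    reader = csv.DictReader(handle)
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    interactions = defaultdict(list)
    for row in reader:
        interactions[row["CODE_ID"]].append(row)
    for turns in interactions.values():
        # Assumption: Turn holds integers.
        turns.sort(key=lambda r: int(r["Turn"]))
    return dict(interactions)


def user_turns(turns):
    """Keep only the rows sent by the user (Participant == 'U')."""
    return [r for r in turns if r["Participant"] == "U"]


# Placeholder rows invented for illustration; NOT real corpus content.
SAMPLE = """CODE_ID,Turn,Participant,Text,VIP,Token count,DAs_GOLD,Errors_GOLD,FSA_ID
d1,2,S,placeholder agent reply,x,3,label_b,,fsa_a
d1,1,U,placeholder user question,x,3,label_a,,fsa_a
d2,1,U,another placeholder,x,2,label_a,,fsa_b
"""

interactions = load_interactions(io.StringIO(SAMPLE))
```

With the real file, the same function could be called on a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` is wherever the CSV was saved after access was granted.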