This paper describes a corpus consisting of real-world dialogues in English between users and a task-oriented conversational agent, with interactions revolving around the description of finite state automata. The creation of this corpus is part of a larger research project aimed at developing tools for an easier access to educational content, especially in STEM fields, for users with visual impairments. The development of this corpus was precisely motivated by the aim of providing a useful resource to support the design of such tools. The core feature of this corpus is that its creation involved both sighted and visually impaired participants, thus allowing for a greater diversity of perspectives and giving the opportunity to identify possible differences in the way the two groups of participants interacted with the agent. The paper introduces this corpus, giving an account of the process that led to its creation, i.e. the methodology followed to obtain the data, the annotation scheme adopted, and the analysis of the results. Finally, the paper reports the results of a classification experiment on the annotated corpus, and an additional experiment to assess the annotation capabilities of three large language models, in view of a further expansion of the corpus. The corpus is released under the Creative Commons Attribution Non Commercial 4.0 International license and available, only for research purposes, at: https://zenodo.org/records/10822733.

Educational Dialogue Systems for Visually Impaired Students: Introducing a Task-Oriented User-Agent Corpus

Sanguinetti M.;
2024-01-01

Abstract

This paper describes a corpus consisting of real-world dialogues in English between users and a task-oriented conversational agent, with interactions revolving around the description of finite state automata. The creation of this corpus is part of a larger research project aimed at developing tools for an easier access to educational content, especially in STEM fields, for users with visual impairments. The development of this corpus was precisely motivated by the aim of providing a useful resource to support the design of such tools. The core feature of this corpus is that its creation involved both sighted and visually impaired participants, thus allowing for a greater diversity of perspectives and giving the opportunity to identify possible differences in the way the two groups of participants interacted with the agent. The paper introduces this corpus, giving an account of the process that led to its creation, i.e. the methodology followed to obtain the data, the annotation scheme adopted, and the analysis of the results. Finally, the paper reports the results of a classification experiment on the annotated corpus, and an additional experiment to assess the annotation capabilities of three large language models, in view of a further expansion of the corpus. The corpus is released under the Creative Commons Attribution Non Commercial 4.0 International license and available, only for research purposes, at: https://zenodo.org/records/10822733.
2024
accessibility; conversational agents; education; visual impairment
File in questo prodotto:
File Dimensione Formato  
2024.lrec-main.489.pdf

accesso aperto

Tipologia: versione editoriale (VoR)
Dimensione 477.79 kB
Formato Adobe PDF
477.79 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/417123
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact