COCO: Semantic-enriched collection of online courses at scale with experimental use cases

Danilo Dessì;Gianni Fenu;Mirko Marras;Diego Reforgiato Recupero
2018-01-01

Abstract

With the proliferation in number and scale of online courses, several challenges have emerged in supporting stakeholders during course delivery and use. Machine Learning and Semantic Analysis can add value to the underlying online environments and help overcome a subset of these challenges (e.g., classification, retrieval, and recommendation). However, conducting reproducible experiments in these applications is still an open problem because the datasets available in Technology-Enhanced Learning (TEL) are scarce and mostly small and local. In this paper, we propose COCO, a novel semantic-enriched collection of over 43 K online courses, 16 K instructors, and 2.5 M learners who provided 4.5 M ratings and 1.2 M comments in total. COCO surpasses existing TEL datasets in scale, completeness, and comprehensiveness. Besides describing the collection procedure and the dataset structure, we present and analyze two potential use cases as meaningful examples of the wide variety of multi-disciplinary studies that COCO makes possible.
Files in this record:
coco-semantic-enriched.pdf

Access: archive managers only
Type: pre-print version
Size: 513.88 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11584/254336
Citations
  • PubMed Central: not available
  • Scopus: 35
  • ISI Web of Science: 24