A review of the state of the child speech synthesis landscape

Pagliara, Silvio; Mura, Antonello; Gerazov, Branislav
2025-01-01

Abstract

Child voices in Text-To-Speech (TTS) are essential for enabling children with speech or communication difficulties to express themselves authentically, supporting their social inclusion and sense of identity. However, the development and availability of high-quality child voices remain limited compared to adult voices, mostly because child speech data is scarce and because of ethical concerns, given that the voice belongs to a minor. In this paper we review the current TTS technology landscape, with a special focus on systems that offer child voices and on technology that targets offline use on low-resource mobile devices, such as smartphones and tablets. We also explore research efforts in creating child TTS by reviewing papers on neural child TTS models in terms of the technologies used and the quality of the resulting voices. Additionally, we briefly summarize the available TTS engines and their voices to check the availability of child voices. Finally, we examine the use of child voices in available tools and applications.
Year: 2025
ISBN: 9783032016317; 9783032016324
Keywords: Text To Speech (TTS); Child TTS voices; Multi-speaker TTS

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/455405