A Review of the State of the Child Speech Synthesis Landscape

Pagliara, Silvio; Mura, Antonello; Gerazov, Branislav
2025-01-01

Abstract

Child voices in Text-To-Speech (TTS) are essential for enabling children with speech or communication difficulties to express themselves authentically, supporting their social inclusion and sense of identity. However, the development and availability of high-quality child voices remain limited compared to adult voices, mostly because of the scarcity of child speech data and the ethical concerns that arise because the voice belongs to a minor. In this paper we review the current state of the TTS technological landscape, with a special focus on systems that offer child voices and on technology that targets offline use on low-resource mobile devices such as smartphones and tablets. We also explore research efforts in creating child TTS by reviewing papers on neural child TTS models in terms of the technologies used and the quality of the resulting voices. Additionally, we briefly summarize the available TTS engines and their voices to check the availability of child voices. Finally, we examine the use of child voices in available tools and applications.
2025
9783032016317
9783032016324
Text To Speech (TTS); Child TTS Voices; Multi-speaker TTS


Use this identifier to cite or link to this document: https://hdl.handle.net/11584/455405