A Review of the State of the Speech Synthesis Technology Landscape – Neural TTS on the Edge

Gerazov, Branislav; Mura, Antonello; Pagliara, Silvio
2025-01-01

Abstract

Speech synthesis technology, particularly neural Text-to-Speech (TTS), has advanced rapidly in recent years. This review examines the current state of neural TTS systems and their availability for on-device deployment at the edge. Traditionally, neural TTS models have required substantial computational resources, which has limited them to server-based cloud implementations. However, recent innovations in model architecture and synthesis techniques are making it possible to deploy these systems on edge devices with limited processing power. These developments are crucial for applications that require low latency and that must process data locally without relying on cloud services. One important use case is assistive technology, such as Augmentative and Alternative Communication (AAC) for users with speech impairments and screen readers for the visually impaired.
Year: 2025
ISBN: 9783032016317; 9783032016324
Keywords: neural speech synthesis; text-to-speech; Augmentative and Alternative Communication (AAC); screen readers; on-device

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/454627