UNICA IRIS Institutional Research Information System

A review of the state of the speech synthesis technology landscape – Neural TTS on the edge

Gerazov, Branislav; Lazareva, Vanesa; Dimitrovska, Marija Markovska; Taskovski, Dimitar; Mavrou, Katerina; Theodorou, Eleni; Zanfardino, Francesco; Spera, Antonio; Mura, Antonello; Rybińska, Anna; Agius, May; Todorovska, Danche; Charalambous-Darden, Nefi; Łuszczak, Katarzyna; Pagliara, Silvio

2025-01-01

Abstract

Speech synthesis technology, particularly neural Text-to-Speech (TTS), has advanced rapidly in recent years. This review examines the current state of neural TTS systems and their availability for on-device (edge) deployment. Traditionally, neural TTS models have required substantial computational resources, which limited them to server-based cloud implementations. Recent innovations in model architecture and synthesis techniques, however, are making it possible to deploy these systems on edge devices with limited processing power. These developments are crucial for applications requiring low latency, in which data must be processed locally without reliance on cloud services. One important use case is assistive technology, such as Augmentative and Alternative Communication (AAC) for users with speech impairments and screen readers for visually impaired users.

Year: 2025
ISBN: 9783032016317; 9783032016324
Keywords: Neural speech synthesis; Text to speech; Augmentative and Alternative Communication (AAC); Screen readers; On-device
Type: 4.1 Contribution in conference proceedings

Files in this item:

File	Size	Format
A Review of the State of the Speech Synthesis Technology Landscape – Neural TTS on the Edge.pdf (publisher's version, VoR; access restricted to archive managers)	370.19 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/454627
