UNICA IRIS Institutional Research Information System

CARgram: CNN-based accident recognition from road sounds through intensity-projected spectrogram analysis

Podda, Alessandro Sebastian;Balia, Riccardo;Pompianu, Livio;Carta, Salvatore;Fenu, Gianni;Saia, Roberto

2024-01-01

Abstract

Road surveillance systems play an important role in traffic monitoring and in detecting hazardous events. In recent years, several artificial intelligence-based approaches have been proposed for this purpose, typically based on the analysis of the acquired video streams. However, occlusions, poor lighting conditions, and the heterogeneity of events often reduce their effectiveness and reliability. To overcome these limitations, scientific and industrial research has focused on integrating such solutions with audio recognition methods. By automatically identifying anomalous traffic sounds, e.g., car crashes and skids, these methods help reduce false positives and missed alarms. Following this trend, in this work we propose an innovative pipeline for the analysis of intensity-projected audio spectrograms from streams of traffic sounds, which exploits both (i) a visual approach based on a custom, special-purpose Convolutional Neural Network for the identification of anomalous events in the sound signal; and (ii) a novel multi-representational encoding of the input, which proved to significantly improve the recognition accuracy of the neural models. Validated on the public MIVIA dataset, the proposed pipeline achieved a false positive rate of 0.96%, outperforming the state-of-the-art competitors. Notably, following these findings, a prototype implementation has been deployed on a real-world video surveillance infrastructure.

Year of publication: 2024

Keywords: Convolutional neural networks; Deep learning; Audio analysis; Traffic surveillance; Artificial Intelligence

Type: 1.1 Journal article

Files in this item:

File	Size	Format
1-s2.0-S1051200424000563-main.pdf (open access; type: publisher's version, VoR)	1.96 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/425794
