UNICA IRIS Institutional Research Information System

RoadSense3D: A Framework for Roadside Monocular 3D Object Detection

Carta S.;Castrillon-Santana M.;Marras M.;Mohamed S.;Podda A. S.;Saia R.;Sau M.;Zimmer W.

2024-01-01

Abstract

Monocular cameras are widely regarded as a cost-effective option for 3D object understanding, with applications in autonomous driving, augmented/virtual reality, and roadside monitoring. Despite recent progress, it remains difficult to build generalized models that adapt to unforeseen scenarios and diverse camera configurations. In this work, we focus on monocular 3D object detection in roadside environments. First, we introduce a versatile methodology for generating and labeling datasets tailored to roadside scenarios, addressing limitations encountered in real-world settings. Second, we develop an array of deep learning models for this task, refining them to address practical challenges that emerge in real-world application. Finally, leveraging our framework, we curate a synthetic benchmark dataset comprising 1,415,680 frames and 8,902,636 labeled 3D objects, and assess the performance of existing models across all datasets.

Year

2024

Keywords

3D Object Detection
Monocular 3D Perception
Roadside Dataset

Files in this item:

There are no files associated with this item.

The metadata in IRIS UNICA are released under the Creative Commons CC0 1.0 Universal license, while the publication files are protected by copyright unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/432645

Warning

The data shown have not been validated by the university.
