UNICA IRIS Institutional Research Information System

Crowd counting and density estimation are crucial functionalities in intelligent video surveillance systems but are also very challenging computer vision tasks in scenarios characterised by dense crowds, due to scale and perspective variations, overlapping and occlusions. Regression-based crowd counting models are used for dense crowd scenes, where pedestrian detection is infeasible. We focus on real-world, cross-scene application scenarios where no manually annotated images of the target scene are available for training regression models, but only images with different backgrounds and camera views can be used (e.g., from publicly available data sets), which can lead to low accuracy. To overcome this issue, we propose to build the training set using emph{synthetic} images of the target scene, which can be automatically annotated with no manual effort. This work provides a preliminary empirical evaluation of the effectiveness of the above solution. To this aim, we carry out experiments using real data sets as the target scenes (testing set) and using different kinds of synthetically generated crowd images of the target scenes as training data. Our results show that synthetic training images can be effective, provided that also their background, beside their perspective, closely reproduces the one of the target scene.

Investigating Synthetic Data Sets for Crowd Counting in Cross-scene Scenarios

Delussu, Rita;Putzu, Lorenzo;Fumera, Giorgio

2020-01-01

Abstract

Crowd counting and density estimation are crucial functionalities in intelligent video surveillance systems but are also very challenging computer vision tasks in scenarios characterised by dense crowds, due to scale and perspective variations, overlapping and occlusions. Regression-based crowd counting models are used for dense crowd scenes, where pedestrian detection is infeasible. We focus on real-world, cross-scene application scenarios where no manually annotated images of the target scene are available for training regression models, but only images with different backgrounds and camera views can be used (e.g., from publicly available data sets), which can lead to low accuracy. To overcome this issue, we propose to build the training set using emph{synthetic} images of the target scene, which can be automatically annotated with no manual effort. This work provides a preliminary empirical evaluation of the effectiveness of the above solution. To this aim, we carry out experiments using real data sets as the target scenes (testing set) and using different kinds of synthetically generated crowd images of the target scenes as training data. Our results show that synthetic training images can be effective, provided that also their background, beside their perspective, closely reproduces the one of the target scene.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2020
			
	Codice ISBN
	
				978-989-758-402-2
			
	Parole chiave
	
				Cross-scene, Crowd Analysis, Crowd Density Estimation, Synthetic Data Sets, Texture Features, Regression
			
	Tipologia:
	
				4.1 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
Paper.pdf accesso aperto Descrizione: Articolo principale Tipologia: versione post-print (AAM) Dimensione 4.4 MB Formato Adobe PDF Visualizza/Apri	4.4 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/298186

Citazioni

ND

7

6

social impact