UNICA IRIS Institutional Research Information System

This paper proposes a novel method for robust visual tracking of arbitrary objects, based on the combination of image-based prediction and position refinement by weighted correlation. The effectiveness of the proposed approach is demonstrated on a challenging set of dynamic video sequences, extracted from the final of triple jump at the London 2012 Summer Olympics. A comparison is made against five baseline tracking systems. The novel system shows remarkable superior performances with respect to the other methods, in all considered cases characterized by changing background, and a large variety of articulated motions. The novel architecture, from here onward named 2D Recurrent Neural Network (2D-RNN), is derived from the well-known recurrent neural network model and adopts nearest neighborhood connections between the input and context layers in order to store the temporal information content of the video. Starting from the selection of the object of interest in the first frame, neural computation is applied to predict the position of the target in each video frame. Normalized cross-correlation is then applied to refine the predicted target position. 2D-RNN ensures limited complexity, great adaptability and a very fast learning time. At the same time, it shows on the considered dataset fast execution times and very good accuracy, making this approach an excellent candidate for automated analysis of complex video streams.

2D recurrent neural networks: a high-performance tool for robust visual tracking in dynamic scenes

Masala, Giovanni;Casu, Filippo;Golosio, Bruno;Grosso, Enrico

2018-01-01

Abstract

This paper proposes a novel method for robust visual tracking of arbitrary objects, based on the combination of image-based prediction and position refinement by weighted correlation. The effectiveness of the proposed approach is demonstrated on a challenging set of dynamic video sequences, extracted from the final of triple jump at the London 2012 Summer Olympics. A comparison is made against five baseline tracking systems. The novel system shows remarkable superior performances with respect to the other methods, in all considered cases characterized by changing background, and a large variety of articulated motions. The novel architecture, from here onward named 2D Recurrent Neural Network (2D-RNN), is derived from the well-known recurrent neural network model and adopts nearest neighborhood connections between the input and context layers in order to store the temporal information content of the video. Starting from the selection of the object of interest in the first frame, neural computation is applied to predict the position of the target in each video frame. Normalized cross-correlation is then applied to refine the predicted target position. 2D-RNN ensures limited complexity, great adaptability and a very fast learning time. At the same time, it shows on the considered dataset fast execution times and very good accuracy, making this approach an excellent candidate for automated analysis of complex video streams.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2018
			
	Parole chiave
	
				Automated video analysis; Convolutional network; Recurrent neural network; Video tracking; Software; Artificial Intelligence
			
	Tipologia:
	
				1.1 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
Masala2018_Article_2DRecurrentNeuralNetworksAHigh.pdf Solo gestori archivio Tipologia: versione editoriale (VoR) Dimensione 1.47 MB Formato Adobe PDF Visualizza/Apri Richiedi una copia	1.47 MB	Adobe PDF	Visualizza/Apri Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/236184

Citazioni

ND

3

3

social impact