UNICA IRIS Institutional Research Information System

Video action recognition and prediction architecture for a robotic coach

Nino Cauli; Diego Reforgiato

2020-01-01

Abstract

In this paper we introduce a novel architecture to recognise and predict human actions from video sequences. The architecture will be part of a larger system meant to promote active ageing among the elderly. The system will consist of a robotic coach able to schedule daily exercises, listen to patients' requests, monitor the exercises, and correct errors in their execution. Taking the video stream of a monocular RGB camera as input, the proposed architecture will recognise the movement performed by the elder and predict the next expected visual (camera frames) and proprioceptive (encoders) sensory inputs. To keep track of past frames, we chose a Convolutional Neural Network (CNN) that combines standard convolutional layers with recurrent convolutional layers (ConvLSTM or ConvGRU). Following the Predictive Coding paradigm, a single network will both recognise the actions and predict the future visuo-proprioceptive stimuli. The full robotic coach system will be implemented on an affordable humanoid robot, the NAO.
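To make the recurrent convolutional idea concrete, the following is a minimal sketch of a ConvGRU cell that carries a hidden feature map across frames, written in plain Python so that it runs without any deep learning library. It is an illustration under strong assumptions, not the architecture of the paper: it uses a single channel, 3x3 kernels, random untrained weights, and the hypothetical names `ConvGRUCell`, `run_sequence`, and `predict`. The proprioceptive (encoder) readout mentioned in the abstract is omitted for brevity.

```python
import math
import random


def conv2d_same(fmap, kernel):
    """'Same' 2D cross-correlation with zero padding on a list-of-lists map."""
    h, w, k = len(fmap), len(fmap[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(-k, k + 1):
                for dj in range(-k, k + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        out[i][j] += fmap[y][x] * kernel[di + k][dj + k]
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def combine(fn, *maps):
    """Apply fn element-wise across equally sized 2D maps."""
    return [[fn(*vals) for vals in zip(*rows)] for rows in zip(*maps)]


class ConvGRUCell:
    """Single-channel ConvGRU cell with random (untrained) 3x3 kernels."""

    def __init__(self, seed=0):
        rng = random.Random(seed)

        def kernel():
            return [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]

        # One input kernel and one hidden-state kernel per gate:
        # update (z), reset (r), and candidate state (n).
        self.wx = {g: kernel() for g in "zrn"}
        self.wh = {g: kernel() for g in "zrn"}

    def gate(self, name, x, h, act):
        return combine(lambda a, b: act(a + b),
                       conv2d_same(x, self.wx[name]),
                       conv2d_same(h, self.wh[name]))

    def step(self, x, h):
        z = self.gate("z", x, h, sigmoid)
        r = self.gate("r", x, h, sigmoid)
        reset_h = combine(lambda a, b: a * b, r, h)
        n = self.gate("n", x, reset_h, math.tanh)
        # Blend the previous state with the candidate state.
        return combine(lambda zv, hv, nv: (1 - zv) * hv + zv * nv, z, h, n)


def run_sequence(cell, frames):
    """Feed frames in order, starting from an all-zero hidden state."""
    h = [[0.0] * len(frames[0][0]) for _ in frames[0]]
    for frame in frames:
        h = cell.step(frame, h)
    return h


def predict(h, frame_kernel, action_weights):
    """Hypothetical readouts: a next-frame map and one score per action."""
    next_frame = combine(sigmoid, conv2d_same(h, frame_kernel))
    pooled = sum(map(sum, h)) / (len(h) * len(h[0]))  # global average pooling
    action_scores = [w * pooled for w in action_weights]
    return next_frame, action_scores
```

Both readouts share the same hidden state, which mirrors the idea of using a single network for recognition and prediction. A trained model would use many channels, learned weights, and a framework with ConvLSTM or ConvGRU layers instead of these hand-written loops.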

Year

2020

Type:

4.1 Contribution in conference proceedings

Files in this record:

File	Size	Format
videopaper6.pdf (open access; type: publisher's version, VoR)	2.51 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/308979
