Video action recognition and prediction architecture for a robotic coach

Diego Reforgiato
2020-01-01

Abstract

In this paper we introduce a novel architecture to recognise and predict human actions from video sequences. This architecture will be part of a larger system meant to promote elders' active ageing. The system will consist of a robotic coach able to schedule daily exercises, listen to patients' requests, monitor the exercises, and correct errors in their execution. Using a monocular RGB camera video stream as input, the proposed architecture will recognise the movement performed by the elder and predict the next expected visual (camera frames) and proprioceptive (encoders) sensory inputs. To keep track of past frames, we chose a Convolutional Neural Network (CNN) that combines standard convolutional layers with recurrent convolutional layers (ConvLSTM or ConvGRU). Based on the Predictive Coding paradigm, a single network will both recognise the actions and predict the future visuo-proprioceptive stimuli. The full robotic coach system will be implemented on an affordable humanoid robot, the NAO.
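As a rough illustration of the recurrent convolutional idea mentioned in the abstract, the sketch below steps a single-channel ConvGRU cell over a short frame sequence and measures how far its next-frame prediction is from the frame that actually arrives, which is the error signal a predictive-coding scheme would minimise. It is not the authors' network: the kernels are hand-set rather than trained, there is one channel and no action-classification or proprioceptive output, and every name (`conv2d_same`, `conv_gru_step`, `run_sequence`, the kernel keys) is an assumption made for this example. It uses the gate convention h' = (1 - z) * h + z * candidate.

```python
import math


def conv2d_same(image, kernel):
    """Single-channel 2-D cross-correlation with zero padding (output keeps the input size)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def combine(fn, *maps):
    """Apply fn element-wise across equally sized 2-D maps."""
    return [[fn(*vals) for vals in zip(*rows)] for rows in zip(*maps)]


def conv_gru_step(x, h, k):
    """One ConvGRU update for frame x and hidden state h (k: dict of hypothetical kernels)."""
    z = combine(lambda a, b: sigmoid(a + b),
                conv2d_same(x, k["xz"]), conv2d_same(h, k["hz"]))  # update gate
    r = combine(lambda a, b: sigmoid(a + b),
                conv2d_same(x, k["xr"]), conv2d_same(h, k["hr"]))  # reset gate
    reset_h = combine(lambda a, b: a * b, r, h)
    cand = combine(lambda a, b: math.tanh(a + b),
                   conv2d_same(x, k["xh"]), conv2d_same(reset_h, k["hh"]))
    return combine(lambda zi, hi, ci: (1.0 - zi) * hi + zi * ci, z, h, cand)


def mean_squared_error(a, b):
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def run_sequence(frames, kernels):
    """Feed frames one by one; after each, predict the next frame and record the error."""
    hidden = [[0.0] * len(frames[0][0]) for _ in frames[0]]
    errors = []
    for t in range(len(frames) - 1):
        hidden = conv_gru_step(frames[t], hidden, kernels)
        predicted_next = conv2d_same(hidden, kernels["out"])  # linear readout of the state
        errors.append(mean_squared_error(predicted_next, frames[t + 1]))
    return hidden, errors


if __name__ == "__main__":
    # Toy input: a bright pixel moving left to right on a 4x4 grid.
    frames = []
    for col in range(4):
        frame = [[0.0] * 4 for _ in range(4)]
        frame[1][col] = 1.0
        frames.append(frame)
    names = ("xz", "hz", "xr", "hr", "xh", "hh", "out")
    kernels = {name: [[0.1] * 3 for _ in range(3)] for name in names}
    _, errors = run_sequence(frames, kernels)
    print("per-step prediction error:", errors)
```

In a trained system the kernels would be learned so that the prediction error shrinks over the sequence, and further output heads would map the hidden state to an action label and to the predicted encoder readings.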
Files in this item:
videopaper6.pdf (open access; type: publisher's version; format: Adobe PDF; size: 2.51 MB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/308979