Combining declarative models and computer vision recognition algorithms for stroke gestures

CARCANGIU, ALESSANDRO
2019-02-08

Abstract

Consumer-level devices that track the user's gestures have eased the design and implementation of interactive applications that rely on body movements as input. Gesture recognition based on computer vision and machine learning focuses mainly on accuracy and robustness. The resulting classifiers label gestures precisely once they have been performed, but they do not provide intermediate information during execution. Human-Computer Interaction research has focused instead on providing easy and effective guidance for performing and discovering interactive gestures. The compositional approaches developed to solve this problem provide information on both the whole gesture and its sub-parts, but they rely on heuristic techniques that have a low recognition accuracy. In this thesis, we introduce two methods, DEICTIC and G-Gene, designed to strike a compromise between accuracy and the information provided. DEICTIC exploits a compositional and declarative description of stroke gestures. It uses basic Hidden Markov Models (HMMs) to recognise meaningful predefined primitives (gesture sub-parts) and composes them to recognise complex gestures. It provides information for supporting gesture guidance, and it reaches an accuracy comparable with state-of-the-art approaches on two datasets from the literature. However, the normalization of the gesture samples limits online recognition in the general case. G-Gene, by contrast, is a method for transforming compositional stroke gesture definitions into profile HMMs, which provide both good accuracy and information on gesture sub-parts. It supports online recognition without using any global feature, and it updates the information while receiving the input stream, with an accuracy suitable for prototyping the interaction.
We evaluated both approaches in a simplified development task with real developers, showing that they require less time and an effort comparable to that of compositional approaches, while the definition procedure and the perceived recognition accuracy are comparable to those of machine learning.
Files in this item:
tesi di dottorato_Alessandro Carcangiu.pdf
Description: doctoral thesis
Access: open access
Size: 6.61 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/260670