
Sensor-based activity recognition: One picture is worth a thousand words

Riboni, Daniele
2019-01-01

Abstract

In several domains, including healthcare and home automation, it is important to unobtrusively monitor the activities of daily living (ADLs) carried out by people at home. A popular approach relies on sensors attached to everyday objects to capture user interaction, and on ADL models to recognize the current activity based on the temporal sequence of used objects. Often, ADL models are automatically extracted from labeled datasets of activities and sensor events, using supervised learning techniques. Unfortunately, acquiring such datasets in smart homes is expensive and violates users' privacy. Hence, an alternative solution is to manually define ADL models based on common sense, using logic languages such as description logics. However, manual specification of ADL ontologies is cumbersome, and rigid ontological definitions fail to capture the variability of activity execution. In this paper, we introduce a radically new approach enabled by the recent proliferation of tagged visual content available on the Web. Indeed, thanks to the popularity of social network applications, people increasingly share pictures and videos taken during the execution of every kind of activity. Often, shared content is tagged with metadata, manually specified by its owners, that concisely describes the depicted activity. Those metadata represent an implicit activity label of the picture or video. Moreover, today's computer vision tools support accurate extraction of tags describing the situation and the objects that appear in the visual content. By reasoning with those tags and their corresponding activity labels, we can reconstruct accurate models of a comprehensive set of human activities executed in a wide variety of situations. This approach overcomes the main shortcomings of existing techniques. Compared to supervised learning methods, it does not require the acquisition of training sets of sensor events and activities. Compared to knowledge-based methods, it does not involve any manual modeling effort, and it captures a comprehensive array of execution modalities. Through extensive experiments with large datasets of real-world ADLs, we show that this approach is practical and effective.
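The abstract does not spell out the reasoning technique used in the paper, but the general idea it describes — mining object–activity co-occurrences from tagged Web images and matching them against the objects whose sensors fire in the home — can be illustrated with a minimal, hypothetical Python sketch. All function names, the toy data, and the simple frequency-based scoring below are invented for illustration only and are not the authors' actual method.

```python
from collections import Counter, defaultdict

def build_activity_models(tagged_images):
    """Build per-activity object-relevance weights from tagged Web images.

    tagged_images: iterable of (activity_label, object_tags) pairs, where
    object_tags would come from a computer vision tagger and the activity
    label from the owner-specified metadata of the picture or video.
    """
    counts = defaultdict(Counter)
    for activity, tags in tagged_images:
        counts[activity].update(tags)
    # Normalize raw co-occurrence counts into relative frequencies.
    return {
        activity: {obj: n / sum(ctr.values()) for obj, n in ctr.items()}
        for activity, ctr in counts.items()
    }

def recognize(models, used_objects):
    """Rank candidate ADLs by the relevance of the objects observed by sensors."""
    scores = {
        activity: sum(weights.get(obj, 0.0) for obj in used_objects)
        for activity, weights in models.items()
    }
    return max(scores, key=scores.get)

# Toy data: (activity label, tags extracted from a shared picture).
web_images = [
    ("preparing tea", ["kettle", "cup", "teabag"]),
    ("preparing tea", ["kettle", "cup", "spoon"]),
    ("cooking pasta", ["stove", "pot", "colander"]),
]
models = build_activity_models(web_images)
# Sequence of objects whose sensors fired in the smart home.
print(recognize(models, ["kettle", "spoon", "cup"]))  # -> "preparing tea"
```

In this sketch the "model" of an activity is just a bag of weighted object tags; the paper's approach presumably uses richer reasoning over tags and activity labels, but the pipeline shape (Web images with labels in, object-based activity models out, sensor event sequences matched at recognition time) is the same.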
Keywords: Activity recognition; Intelligent systems; Pervasive computing; Activity models; Unsupervised reasoning
Files in this record:

File: 19-fgcs.pdf
Access: open access
Type: pre-print version
Size: 3.53 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/274168
Citations
  • PMC: ND
  • Scopus: 17
  • Web of Science (ISI): 13