Nowadays, intelligent video surveillance systems are being developed to support human operators in different monitoring and investigation tasks. Although relevant results have been achieved by the research community in several computer vision tasks, some real applications still exhibit several open issues. In this context, this thesis focused on two challenging computer vision tasks: person re-identification and crowd counting. Person re-identification aims to retrieve images of a person of interest, selected by the user, in different locations over time, reducing the time required to the user to analyse all the available videos. Crowd counting consists of estimating the number of people in a given image or video. Both tasks present several complex issues. In this thesis, a challenging video surveillance application scenario is considered in which it is not possible to collect and manually annotate images of a target scene (e.g., when a new camera installation is made by Law Enforcement Agency) to train a supervised model. Two human centered solutions for the above mentioned tasks are then proposed, in which the role of the human operators is fundamental. For person re-identification, the human-in-the-loop approach is proposed, which exploits the operator feedback on retrieved pedestrian images during system operation, to improve system's effectiveness. The proposed solution is based on revisiting relevance feedback algorithms for content-based image retrieval, and on developing a specific feedback protocol, to find a trade-off between the human effort and re-identification performance. For crowd counting, the use of a synthetic training set is proposed to develop a scene-specific model, based on a minimal amount of information of the target scene required to the user. Both solutions are empirically investigated using state-of-the-art supervised models based on Convolutional Neural Network, on benchmark data sets.
Human Centered Computer Vision Techniques for Intelligent Video Surveillance Systems
DELUSSU, RITA
2021-02-23
Abstract
Nowadays, intelligent video surveillance systems are being developed to support human operators in different monitoring and investigation tasks. Although relevant results have been achieved by the research community in several computer vision tasks, some real applications still exhibit several open issues. In this context, this thesis focused on two challenging computer vision tasks: person re-identification and crowd counting. Person re-identification aims to retrieve images of a person of interest, selected by the user, in different locations over time, reducing the time required to the user to analyse all the available videos. Crowd counting consists of estimating the number of people in a given image or video. Both tasks present several complex issues. In this thesis, a challenging video surveillance application scenario is considered in which it is not possible to collect and manually annotate images of a target scene (e.g., when a new camera installation is made by Law Enforcement Agency) to train a supervised model. Two human centered solutions for the above mentioned tasks are then proposed, in which the role of the human operators is fundamental. For person re-identification, the human-in-the-loop approach is proposed, which exploits the operator feedback on retrieved pedestrian images during system operation, to improve system's effectiveness. The proposed solution is based on revisiting relevance feedback algorithms for content-based image retrieval, and on developing a specific feedback protocol, to find a trade-off between the human effort and re-identification performance. For crowd counting, the use of a synthetic training set is proposed to develop a scene-specific model, based on a minimal amount of information of the target scene required to the user. Both solutions are empirically investigated using state-of-the-art supervised models based on Convolutional Neural Network, on benchmark data sets.File | Dimensione | Formato | |
---|---|---|---|
Delussu_PhD_Thesis.pdf
accesso aperto
Descrizione: Human Centered Computer Vision Techniques for Intelligent Video Surveillance Systems
Tipologia:
Tesi di dottorato
Dimensione
6.31 MB
Formato
Adobe PDF
|
6.31 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.