We investigate the implementation of multi-label classification algorithms with a reject option, as a mean to reduce the time required to human annotators and to attain a higher classification accuracy on automatically classified samples than the one which can be obtained without a reject option. Based on a recently proposed model of manual annotation time, we identify two approaches to implement a reject option, related to the two main manual annotation methods: browsing and tagging. In this paper we focus on the approach suitable to tagging, which consists in withholding either all or none of the category assignments of a given sample. We develop classification reliability measures to decide whether rejecting or not a sample, aimed at maximising classification accuracy on non-rejected ones. We finally evaluate the trade-off between classification accuracy and rejection rate that can be attained by our method, on three benchmark data sets related to text categorisation and image annotation tasks.
A Classification Approach with a Reject Option for Multi-label Problems
FUMERA, GIORGIO;ROLI, FABIO
2011-01-01
Abstract
We investigate the implementation of multi-label classification algorithms with a reject option, as a mean to reduce the time required to human annotators and to attain a higher classification accuracy on automatically classified samples than the one which can be obtained without a reject option. Based on a recently proposed model of manual annotation time, we identify two approaches to implement a reject option, related to the two main manual annotation methods: browsing and tagging. In this paper we focus on the approach suitable to tagging, which consists in withholding either all or none of the category assignments of a given sample. We develop classification reliability measures to decide whether rejecting or not a sample, aimed at maximising classification accuracy on non-rejected ones. We finally evaluate the trade-off between classification accuracy and rejection rate that can be attained by our method, on three benchmark data sets related to text categorisation and image annotation tasks.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.