UNICA IRIS Institutional Research Information System

Designing multi-label classifiers that maximize F measures: state of the art

PILLAI, IGNAZIO; FUMERA, GIORGIO; ROLI, FABIO

2017-01-01

Abstract

Multi-label classification problems typically arise in information retrieval tasks such as text and image annotation, and are receiving increasing attention in the machine learning and pattern recognition fields. One of the main issues under investigation is the development of classification algorithms that maximize specific accuracy measures based on precision and recall. We focus on the widely used F measure, defined for binary, single-label problems as the weighted harmonic mean of precision and recall, and later extended to multi-label problems in three ways: macro-averaged, micro-averaged and instance-wise. In this paper we give a comprehensive survey of theoretical results and algorithms aimed at maximizing F measures, organized according to the two main existing approaches: empirical utility maximization and the decision-theoretic approach. Under the former approach, we also derive the optimal (Bayes) classifier at the population level for the instance-wise and micro-averaged F, extending recent results about the single-label F. In a companion paper we shall focus on the micro-averaged F measure, for which relatively few solutions exist, and shall develop novel maximization algorithms under both approaches.
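To make the three multi-label extensions named in the abstract concrete, here is a minimal Python sketch. It is an illustration rather than the authors' code: the abstract gives no formulas, so the sketch assumes the standard count-based form of the weighted harmonic mean, F_beta = (1 + beta^2) TP / ((1 + beta^2) TP + beta^2 FN + FP). The function names, the 0/1 label-matrix layout (rows are instances, columns are labels), and the convention of returning 1.0 for the 0/0 case are all assumptions made for this example.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[int]]  # rows = instances, columns = labels, entries 0/1


def counts(true: Sequence[int], pred: Sequence[int]) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) for two aligned 0/1 sequences."""
    tp = sum(1 for t, p in zip(true, pred) if t and p)
    fp = sum(1 for t, p in zip(true, pred) if not t and p)
    fn = sum(1 for t, p in zip(true, pred) if t and not p)
    return tp, fp, fn


def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall, written with counts."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Assumption: with no true positives, false positives or false negatives
    # (the 0/0 case) we return 1.0; other conventions exist.
    return (1 + b2) * tp / denom if denom > 0 else 1.0


def macro_f(Y: Matrix, P: Matrix, beta: float = 1.0) -> float:
    """Compute F separately for each label (column), then average over labels."""
    cols_true, cols_pred = list(zip(*Y)), list(zip(*P))
    scores = [f_beta(*counts(t, p), beta) for t, p in zip(cols_true, cols_pred)]
    return sum(scores) / len(scores)


def micro_f(Y: Matrix, P: Matrix, beta: float = 1.0) -> float:
    """Pool TP, FP and FN over all instances and labels, then compute one F."""
    tp = fp = fn = 0
    for t_row, p_row in zip(Y, P):
        a, b, c = counts(t_row, p_row)
        tp, fp, fn = tp + a, fp + b, fn + c
    return f_beta(tp, fp, fn, beta)


def instance_f(Y: Matrix, P: Matrix, beta: float = 1.0) -> float:
    """Compute F separately for each instance (row), then average over instances."""
    scores = [f_beta(*counts(t, p), beta) for t, p in zip(Y, P)]
    return sum(scores) / len(scores)


# Toy data (made up for illustration): 3 instances, 3 labels.
Y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
Y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]

print(round(macro_f(Y_true, Y_pred), 4))     # 0.5556
print(round(micro_f(Y_true, Y_pred), 4))     # 0.6667
print(round(instance_f(Y_true, Y_pred), 4))  # 0.6667
```

On this toy data the macro-averaged value is lower than the other two because one label is never predicted correctly, and macro-averaging gives every label equal weight regardless of how often it occurs. The sketch assumes non-empty label matrices.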

Year of publication: 2017
Keywords: Multi-label classification; F measure; Learning algorithms; Empirical utility maximization; Decision-theoretic approach
Type: 1.1 Journal article

Files in this item:

File	Size	Format
PattRec 2017a.pdf (main article; publisher's version, VoR; restricted to archive managers)	647.94 kB	Adobe PDF
paper.pdf (main article; post-print version, AAM; open access)	804.99 kB	Adobe PDF

The metadata in IRIS UNICA are released under the Creative Commons CC0 1.0 Universal license, while the publication files are protected by copyright unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/196811
