
Classifier Ensembles with Trajectory Under-Sampling for Face Re-Identification

Fumera, Giorgio
2016-01-01

Abstract

Class imbalance is an issue in many real-world applications because classification algorithms tend to misclassify instances from the class of interest when its training samples are outnumbered by those of other classes. Several variants of the AdaBoost ensemble method have been proposed in the literature to learn from imbalanced data through re-sampling. However, their loss factor is based on standard accuracy, which still biases performance towards the majority class. This problem is mitigated by cost-sensitive Boosting algorithms, although it can be avoided at the outset by modifying the loss factor calculation. In this paper, two loss factors, based on the F-measure and the G-mean, are proposed that are more suitable for dealing with imbalanced data during the Boosting learning process. The performance of standard AdaBoost and of three versions specialized for class imbalance (SMOTEBoost, RUSBoost, and RB-Boost) is empirically evaluated using the proposed loss factors, both on synthetic data and on a real-world face re-identification task. Experimental results show a significant performance improvement for AdaBoost and RUSBoost with the proposed loss factors.
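To make the two metrics concrete, the following is a minimal sketch of the F-measure and G-mean on binary labels, assuming class 1 is the minority class of interest. This illustrates only the metrics themselves, not the paper's exact loss-factor weighting inside Boosting:

```python
import numpy as np

def f_measure(y_true, y_pred, positive=1):
    """F-measure: harmonic mean of precision and recall on the positive class."""
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def g_mean(y_true, y_pred, positive=1):
    """G-mean: geometric mean of the per-class recalls (sensitivity, specificity)."""
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    tn = np.sum((y_pred != positive) & (y_true != positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return float(np.sqrt(sensitivity * specificity))
```

Unlike plain accuracy, both quantities stay low whenever the minority class is poorly recognized, even if the majority class is classified almost perfectly; this is the property that motivates using them as the loss factor when re-weighting Boosting iterations on imbalanced data.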
Year: 2016
ISBN: 978-989-758-173-1
Files in this product:
ICPRAM8_CR.pdf — Description: main article; Type: post-print version; Size: 556.82 kB; Format: Adobe PDF; access restricted to archive administrators (copy available on request).

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/236222
Citations
  • Scopus: 5