Multi-classlearningrequiresaclassifiertodiscriminateamongalargeset of L classes in order to define a classification rule able to identify the correct class for new observations. The resulting classification rule could not always be robust, particularly when imbalanced classes are observed or the data size is not large. In this paper a new approach is presented aimed at evaluating the reliability of a classification rule. It uses a standard classifier but it evaluates the reliability of the obtained classification rule by re-training the classifier on resampled versions of the original data. User-defined misclassification costs are assigned to the obtained confusion matrices and then used as inputs in a Beta regression model which provides a cost-sensitive weighted classification index. The latter is used jointly with another index measuring dissimilarity in distribution between observed classes and predicted ones. Both indices are defined in Œ0; 1 so that their values can be graphically represented in a Œ0; 1 2 space. The visual inspection of the points for each classifier allows us to evaluate its reliability on the basis of the relationship between the values of both indices obtained on the original data and on resampled versions of it.
Assessing the Reliability of a Multi-Class Classifier
FRIGAU, LUCA;CONVERSANO, CLAUDIO;MOLA, FRANCESCO
2016-01-01
Abstract
Multi-classlearningrequiresaclassifiertodiscriminateamongalargeset of L classes in order to define a classification rule able to identify the correct class for new observations. The resulting classification rule could not always be robust, particularly when imbalanced classes are observed or the data size is not large. In this paper a new approach is presented aimed at evaluating the reliability of a classification rule. It uses a standard classifier but it evaluates the reliability of the obtained classification rule by re-training the classifier on resampled versions of the original data. User-defined misclassification costs are assigned to the obtained confusion matrices and then used as inputs in a Beta regression model which provides a cost-sensitive weighted classification index. The latter is used jointly with another index measuring dissimilarity in distribution between observed classes and predicted ones. Both indices are defined in Œ0; 1 so that their values can be graphically represented in a Œ0; 1 2 space. The visual inspection of the points for each classifier allows us to evaluate its reliability on the basis of the relationship between the values of both indices obtained on the original data and on resampled versions of it.File | Dimensione | Formato | |
---|---|---|---|
Brema_Frigau.pdf
Solo gestori archivio
Descrizione: versione finale
Tipologia:
versione editoriale (VoR)
Dimensione
2.5 MB
Formato
Adobe PDF
|
2.5 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.