UNICA IRIS Institutional Research Information System

Multi-class text classification of news data

Maurizio Romano (first author); Maria Paola Priola (last author)

2024-01-01

Abstract

Several multi-class text classification (MCC) strategies, namely One-Vs-All (OVA, also known as One-Vs-Rest), One-Vs-One (OVO), Best-of-Best (BOB), and Error-Correcting Output Codes (ECOC), are compared in terms of accuracy and computational efficiency. Each strategy is implemented with several classifiers, such as Naïve Bayes, Random Forest, Logistic Regression, Neural Networks, Linear Discriminant Analysis, Support Vector Machine, and the recently introduced Threshold-based Naïve Bayes (Tb-NB). We run a horse race among these strategy-classifier combinations on the 20NewsGroup dataset, well known in the literature for its complexity. Our results highlight the importance of pairing the right classifier with a suitable strategy, and provide insights for optimizing performance in MCC tasks while weighing both environmental implications and the need for accurate predictions.
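To make the decomposition idea concrete, the following is a minimal sketch of how OVA and OVO reduce a multi-class problem to binary ones. It is not the authors' implementation: the toy corpus, the `CentroidBinary` scorer, and all function names are hypothetical stand-ins chosen for illustration, and BOB, ECOC, and Tb-NB are not covered.

```python
from collections import Counter
from itertools import combinations

def bow(text):
    """Bag-of-words counts for a lower-cased, whitespace-tokenised text."""
    return Counter(text.lower().split())

def n_binary_models(k):
    """Number of binary models needed for k classes: (OVA, OVO)."""
    return k, k * (k - 1) // 2

class CentroidBinary:
    """Hypothetical toy binary scorer (an assumption, not a classifier from
    the paper): a positive score means the document looks like the
    positive class."""

    def fit(self, docs, is_positive):
        self.pos = self._mean([d for d, y in zip(docs, is_positive) if y])
        self.neg = self._mean([d for d, y in zip(docs, is_positive) if not y])
        return self

    @staticmethod
    def _mean(docs):
        total = Counter()
        for d in docs:
            total.update(d)
        n = max(len(docs), 1)
        return {w: c / n for w, c in total.items()}

    def score(self, doc):
        return sum(c * (self.pos.get(w, 0.0) - self.neg.get(w, 0.0))
                   for w, c in doc.items())

def fit_ova(docs, labels):
    """One-Vs-All: one binary model per class (class k against the rest)."""
    return {k: CentroidBinary().fit(docs, [y == k for y in labels])
            for k in sorted(set(labels))}

def predict_ova(models, doc):
    """Pick the class whose binary model gives the highest score."""
    return max(models, key=lambda k: models[k].score(doc))

def fit_ovo(docs, labels):
    """One-Vs-One: one binary model per unordered pair of classes."""
    models = {}
    for a, b in combinations(sorted(set(labels)), 2):
        pairs = [(d, y) for d, y in zip(docs, labels) if y in (a, b)]
        models[(a, b)] = CentroidBinary().fit(
            [d for d, _ in pairs], [y == a for _, y in pairs])
    return models

def predict_ovo(models, doc):
    """Majority vote over all pairwise models (a zero score votes for b)."""
    votes = Counter()
    for (a, b), model in models.items():
        votes[a if model.score(doc) > 0 else b] += 1
    return votes.most_common(1)[0][0]

# Tiny made-up corpus, for illustration only (not the 20NewsGroup data).
train = [
    ("the team won the hockey game", "sport"),
    ("the pitcher threw the baseball", "sport"),
    ("the rocket reached orbit", "space"),
    ("nasa launched a new satellite", "space"),
    ("the engine needs new oil", "autos"),
    ("my car has a flat tire", "autos"),
]
docs = [bow(text) for text, _ in train]
labels = [label for _, label in train]
tests = ["hockey team game", "satellite orbit", "car engine oil"]

ova = fit_ova(docs, labels)
ovo = fit_ovo(docs, labels)
ova_pred = [predict_ova(ova, bow(t)) for t in tests]
ovo_pred = [predict_ovo(ovo, bow(t)) for t in tests]
print(ova_pred, ovo_pred)
print(n_binary_models(20))  # 20 classes: 20 OVA models vs 190 OVO models
```

With 20 classes, as in the dataset named above, OVA trains 20 binary models while OVO trains 190, each on a smaller subset of the data. This is one reason the choice of strategy can affect computational cost as well as accuracy.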

Year: 2024
ISBN: 978-88-5509-645-4
Keywords: Statistical Learning; One-Vs-Rest; One-Vs-All; Naïve Bayes; Tb-NB; Accuracy; 20NewsGroup dataset

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/420444
