UNICA IRIS Institutional Research Information System

A comparative study of thresholding strategies in progressive filtering

Addis, A.; Armano, Giuliano; Vargiu, Eloisa

2011-01-01

Abstract

Thresholding strategies in automated text categorization are an underexplored area of research. They are often treated as a post-processing step of minor importance, on two assumptions: that they make no difference to the performance of a classifier, and that finding the optimal thresholding strategy for any given classifier is trivial. Neither of these assumptions is true. In this paper, we concentrate on progressive filtering, a hierarchical text categorization technique that relies on a local-classifier-per-node approach, thus mimicking the underlying taxonomy of categories. The focus of the paper is on assessing TSA, a greedy threshold selection algorithm, against a relaxed brute-force algorithm and the most relevant state-of-the-art algorithms. Experiments performed on Reuters confirm the validity of TSA.

Year	2011
ISBN	978-3-642-23953-3
Type	2.1 Contribution in volume (chapter or essay)

Files in this item:

File	Size	Format
AIIA-2011-Addis.pdf (publisher's version, VoR; archive managers only)	408.68 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/77066
