Analysis of term roles along taxonomy nodes by adopting discriminant and characteristic capabilities

ARMANO, GIULIANO;FANNI, FRANCESCA;GIULIANI, ALESSANDRO
2015-01-01

Abstract

Taxonomies are becoming essential to a growing number of applications, particularly in specific domains. Originally built by hand, taxonomies have recently become the focus of research on their automatic generation. In particular, a main issue in automatic taxonomy building is the choice of the most suitable features. In this paper, we propose an analysis of how each feature changes its role along taxonomy nodes in a text categorization scenario, in which the features are the terms in textual documents. We argue that, in a hierarchical structure, each node should be represented by its own meaningful and discriminant terms (i.e., feature selection should be performed for each node) instead of relying on a fixed feature space. To assess the discriminant power of a term, we adopt two novel metrics. Our conjecture is that a term can significantly change its discriminant power (and hence its role) along the taxonomy levels. We perform experiments aimed at proving that a significant number of terms play different roles in each taxonomy node, which emphasizes the usefulness of a distinct feature selection for each node. We assert that this analysis should support automatic taxonomy building approaches.
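The idea of selecting features separately for each node can be illustrated with a minimal sketch. The score used here is a hypothetical stand-in (the difference between a term's document rate inside a node and among its siblings); it is not one of the two metrics adopted in the paper, which this page does not define. The toy documents, category names, and function names are likewise assumptions made only for illustration.

```python
def doc_rate(term, docs):
    """Fraction of documents (each a set of terms) that contain `term`."""
    return sum(term in d for d in docs) / len(docs) if docs else 0.0


def stand_in_score(term, node_docs, sibling_docs):
    """Hypothetical stand-in score, NOT the paper's metrics:
    how much more often `term` occurs in the node than in its siblings."""
    return doc_rate(term, node_docs) - doc_rate(term, sibling_docs)


def select_terms(node_docs, sibling_docs, k):
    """Per-node feature selection: keep the k best-scoring terms
    (ties broken alphabetically so the result is deterministic)."""
    vocab = set().union(*node_docs, *sibling_docs)
    ranked = sorted(
        vocab,
        key=lambda t: (-stand_in_score(t, node_docs, sibling_docs), t),
    )
    return ranked[:k]


# Toy two-level taxonomy (assumed data): sports -> {football, tennis}, vs. politics.
football = [{"match", "goal", "team"}, {"match", "goal", "penalty"}]
tennis = [{"match", "racket", "serve"}, {"match", "serve", "ace"}]
sports = football + tennis
politics = [{"vote", "party", "team"}, {"vote", "law"}]

# The same term is scored once per node, against that node's siblings.
top_level = stand_in_score("match", sports, politics)
second_level = stand_in_score("match", football, tennis)
print(top_level, second_level)
print(select_terms(football, tennis, 1))
```

In this toy data, "match" separates sports from politics at the top level but carries no information for telling football from tennis one level down, which is the kind of role change the abstract conjectures. A fixed feature space would treat the term identically at both nodes.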
2015
Characteristic Capability, Discriminant Capability, Taxonomy, Computer Science (all)
Files in this item:
2015-IIR-armano.pdf (open access; publisher's version; Adobe PDF, 595.84 kB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/197177