Learning Classifiers for High-Dimensional Micro-array Data

BOSIN, ANDREA; DESSI, NICOLETTA; PES, BARBARA
2006-01-01

Abstract

In this paper, we address the challenging task of learning accurate classifiers from micro-array datasets involving a large number of features but only a small number of samples. We present a greedy step-by-step procedure (SSFS) that can be used to reduce the dimensionality of the feature space. We apply the Minimum Description Length principle to the training data for weighting each feature and then select an “optimal” feature subset by a greedy approach tuned to a specific classifier. The Acute Lymphoblastic Leukemia dataset is used to evaluate the effectiveness of the SSFS procedure in conjunction with different state-of-the-art classification algorithms.
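
The two-stage idea in the abstract (weight each feature, then greedily grow a subset tuned to a classifier) can be illustrated with a minimal sketch. This is not the authors' SSFS implementation: the abstract does not give the MDL weighting formula or the exact acceptance rule, so the function below assumes the per-feature weights are already computed and assumes a feature is kept only when it strictly improves a classifier-specific score. The names `greedy_select`, `weights`, and `evaluate` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def greedy_select(
    weights: Sequence[float],
    evaluate: Callable[[List[int]], float],
    max_features: Optional[int] = None,
) -> Tuple[List[int], float]:
    """Greedy forward selection over features ranked by weight.

    weights  -- one precomputed relevance weight per feature (assumed to
                stand in for the MDL-based weights; higher is better)
    evaluate -- hypothetical callback returning a score (for example,
                cross-validated accuracy of the chosen classifier) for a
                list of feature indices
    """
    # Rank feature indices from highest to lowest weight.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    if max_features is not None:
        ranked = ranked[:max_features]

    selected: List[int] = []
    best_score = float("-inf")
    for idx in ranked:
        candidate = selected + [idx]
        score = evaluate(candidate)
        # Assumed acceptance rule: keep the feature only on strict improvement.
        if score > best_score:
            selected, best_score = candidate, score
    return selected, best_score
```

Because `evaluate` is supplied by the caller, the same loop can be paired with any classifier, which is one way to read "tuned to a specific classifier". With so few samples, the score would normally come from cross-validation on the training data only.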

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/102009