Stability in biomarker discovery: does ensemble feature selection really help?

DESSI, NICOLETTA;PES, BARBARA
2015-01-01

Abstract

Ensemble feature selection has recently been explored as a promising paradigm for improving the stability, i.e. the robustness with respect to sample variation, of subsets of informative features extracted from high-dimensional domains including genetics and medicine. Although recent literature discusses a number of cases in which ensemble approaches appear to provide more stable results, especially in the context of biomarker discovery, there is a lack of systematic studies offering insight into when, and to what extent, an ensemble method is to be preferred to a simple one. Using a well-known benchmark from the genomics domain, this paper presents an empirical study that evaluates ten selection methods, representative of different selection approaches, and investigates whether they become significantly more stable when used in an ensemble fashion. The results indicate both benefits and limitations of the ensemble paradigm in terms of stability.
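The notion of stability described in the abstract can be illustrated with a small sketch. It is not the authors' protocol: the correlation-based selector, the bootstrap aggregation by selection frequency, the Jaccard similarity measure, the synthetic data, and all function and parameter names are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def top_k_by_correlation(X, y, k):
    """Hypothetical simple selector: keep the k features with the
    largest absolute Pearson correlation with the target."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    scores = np.abs(Xc.T @ yc) / denom
    return set(np.argsort(-scores)[:k].tolist())

def ensemble_top_k(X, y, k, n_bootstraps, rng):
    """Assumed ensemble variant: run the simple selector on bootstrap
    samples and keep the k most frequently selected features."""
    n = X.shape[0]
    counts = np.zeros(X.shape[1])
    for _ in range(n_bootstraps):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        for j in top_k_by_correlation(X[idx], y[idx], k):
            counts[j] += 1
    return set(np.argsort(-counts)[:k].tolist())

def jaccard(a, b):
    """Jaccard similarity between two feature subsets."""
    return len(a & b) / len(a | b)

def stability(select, X, y, n_runs, rng, frac=0.9):
    """Average pairwise Jaccard similarity of the subsets selected on
    random subsamples of the data (one possible stability measure)."""
    n = X.shape[0]
    subsets = []
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        subsets.append(select(X[idx], y[idx]))
    pairs = [(i, j) for i in range(n_runs) for j in range(i + 1, n_runs)]
    return sum(jaccard(subsets[i], subsets[j]) for i, j in pairs) / len(pairs)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a high-dimensional dataset: few samples, many features.
    X = rng.normal(size=(60, 200))
    y = (X[:, 0] + X[:, 1] + rng.normal(size=60) > 0).astype(float)

    simple = stability(lambda A, b: top_k_by_correlation(A, b, 10), X, y, 10, rng)
    ensemble = stability(lambda A, b: ensemble_top_k(A, b, 10, 20, rng), X, y, 10, rng)
    print(f"simple selector stability:   {simple:.3f}")
    print(f"ensemble selector stability: {ensemble:.3f}")
```

Whether the ensemble variant scores higher depends on the data and on the base selector, which is the question the study addresses empirically.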
2015
978-3-319-19065-5
Ensemble feature selection; Feature selection stability; Biomarker discovery; High-dimensional data

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/109321