Pre-filtering Features in Random Forests for Microarray Data Classification
DESSI, NICOLETTA;MILIA, GABRIELE;PES, BARBARA
2012-01-01
Abstract
Random forests have been applied, with promising results, to the analysis of high-dimensional datasets and are receiving increasing attention for the classification of microarray datasets. This paper examines random forests from an experimental perspective. It first aims to confirm their effectiveness in microarray data classification, but its main contribution is twofold: to evaluate the effects of a filtering process that precedes the actual construction of the random forest and, in addition, to provide some insights into the behavior of the critical random forest parameters, i.e. the forest size and the number of variables chosen at each split when growing trees. We experimented with tuning these critical parameters on a public microarray dataset within a filter method. The paper gives suggestions on the optimal choice of these parameters and presents results that compare well with state-of-the-art methods for microarray classification.