Feature selection for genomic data. Paola CERCHIELLO, Silvia FIGINI.
La revue MODULAD, issue 36, July 2007
Abstract:
Building predictive models for genomic mining requires feature
selection as an essential preliminary step to reduce the large number
of available variables. Feature selection is the process of selecting
a generally smaller subset of variables (features) that can be
considered the best, from a statistical point of view, with respect
to the model employed for the analysis. In gene expression
microarray data, being able to select a small number of important
genes not only makes data analysis efficient but also aids their
biological interpretation. Microarray data typically have several
thousand genes (features) but only tens of samples.
Problems that can arise from the small sample size have not
been well addressed in the literature. Our aim is
to discuss some issues in feature selection applied to microarray
data, in order to select the most important genes from
a predictive point of view.
Keywords: Feature selection, Gene expression, Marker Selection,
Kruskal-Wallis test, Model Assessment, Predictive models.
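
The keywords mention the Kruskal-Wallis test, but the abstract does not describe how it is applied. As an illustration only, the sketch below assumes a simple filter approach: compute the Kruskal-Wallis H statistic for each gene across the sample classes and keep the genes with the largest values. The function names (average_ranks, kruskal_wallis_h, select_top_genes) and the toy data are hypothetical and are not taken from the paper. The statistic is computed without a tie correction.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(values, labels):
    """Kruskal-Wallis H for one gene (assumption: no tie correction)."""
    n = len(values)
    ranks = average_ranks(values)
    rank_sums, counts = {}, {}
    for rank, group in zip(ranks, labels):
        rank_sums[group] = rank_sums.get(group, 0.0) + rank
        counts[group] = counts.get(group, 0) + 1
    weighted = sum(rank_sums[g] ** 2 / counts[g] for g in rank_sums)
    return 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)


def select_top_genes(expression, labels, k):
    """Return the indices of the k genes with the largest H statistic.

    expression: one list per gene, each holding one value per sample.
    labels: the class label of each sample, in the same order.
    """
    scores = [kruskal_wallis_h(gene, labels) for gene in expression]
    ranked = sorted(range(len(expression)), key=lambda g: scores[g], reverse=True)
    return ranked[:k]


# Toy data invented for illustration: 3 genes, 6 samples, 2 classes.
toy_labels = ["A", "A", "A", "B", "B", "B"]
toy_expression = [
    [5.1, 4.9, 5.0, 5.2, 4.8, 5.05],  # gene 0: no class separation
    [1.0, 1.2, 0.9, 3.1, 2.9, 3.3],   # gene 1: classes fully separated
    [2.0, 2.5, 2.1, 2.4, 2.2, 2.3],   # gene 2: weak separation
]

top = select_top_genes(toy_expression, toy_labels, k=2)
print(top)  # prints [1, 2]
```

With thousands of genes and only tens of samples, a ranking of this kind is sensitive to the particular samples drawn, so in practice the selection would normally be repeated inside each cross-validation fold rather than performed once on the full data set.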