Title :
Feature selection in the classification of high-dimension data
Author :
Hua, Jianping ; Tembe, Waibhav ; Dougherty, Edward R.
Author_Institution :
Translational Genomics Res. Inst., Phoenix, AZ
Abstract :
Contemporary biological technologies produce extremely high-dimensional data sets with limited samples, which demands feature selection in classifier design. Heretofore, the dimensionalities considered in existing comparative studies of feature selection have been nowhere near those currently in use. This study compares some basic feature-selection methods in settings of 20,000 features, defining distribution models that involve different kinds of relations among the features. It evaluates the performance of the different feature-selection algorithms; the results show some general trends relative to sample size and the relations among the features.
Keywords :
biology computing; feature extraction; pattern classification; biological technologies; classifier design; distribution models; feature selection methods; high dimensional data classification; Bioinformatics; Biology computing; Covariance matrix; Data engineering; Design engineering; Design methodology; Filters; Gene expression; Genomics; Performance evaluation;
Conference_Title :
Genomic Signal Processing and Statistics, 2008. GENSiPS 2008. IEEE International Workshop on
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4244-2371-2
Electronic_ISBN :
978-1-4244-2372-9
DOI :
10.1109/GENSIPS.2008.4555665