DocumentCode :
1988560
Title :
Minimum redundancy feature selection from microarray gene expression data
Author :
Ding, Chris ; Peng, Hanchuan
Author_Institution :
NERSC Div., California Univ., Berkeley, CA, USA
fYear :
2003
fDate :
11-14 Aug. 2003
Firstpage :
523
Lastpage :
528
Abstract :
Selecting a small subset of genes out of the thousands of genes in microarray data is important for accurate classification of phenotypes. Widely used methods typically rank genes according to their differential expression among phenotypes and pick the top-ranked genes. We observe that feature sets so obtained have certain redundancy and study methods to minimize it. Feature sets obtained through the minimum redundancy - maximum relevance framework represent a broader spectrum of characteristics of phenotypes than those obtained through standard ranking methods; they are more robust, generalize well to unseen data, and lead to significantly improved classifications in extensive experiments on five gene expression data sets.
Keywords :
arrays; feature extraction; genetics; redundancy; microarray gene expression data; minimum redundancy - maximum relevance framework; minimum redundancy feature selection; phenotypes; spectrum characteristic; Bioinformatics; Cancer; Diseases; Filters; Gene expression; Laboratories; Learning systems; Proteins; Robustness; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE
Print_ISBN :
0-7695-2000-6
Type :
conf
DOI :
10.1109/CSB.2003.1227396
Filename :
1227396