DocumentCode :
1814545
Title :
Identifying simple discriminatory gene vectors with an information theory approach
Author :
Yun, Zheng ; Keong, Kwoh Chee
Author_Institution :
BIRC, Nanyang Technol. Univ., Singapore
fYear :
2005
fDate :
8-11 Aug. 2005
Firstpage :
13
Lastpage :
24
Abstract :
In feature selection for cancer classification problems, many existing methods consider genes individually, choosing the top genes that have the most significant signal-to-noise statistic or correlation coefficient. However, the class-distinction information provided by such genes may overlap heavily, since their gene expression patterns are similar. The redundancy of including many genes with similar expression patterns results in highly complex classifiers. According to the principle of Occam's razor, simple models are preferable to complex ones if they achieve comparable prediction performance. In this paper, we introduce a new method to learn accurate, low-complexity classifiers from gene expression profiles. In our method, we use mutual information to measure the relation between a set of genes, called a gene vector, and the class attribute of the samples. Gene vectors lie in higher-dimensional spaces than individual genes; therefore, they are more diverse, or contain more information, than individual genes. Hence, gene vectors are preferable to individual genes for describing the class distinctions between samples, since they contain more information about the class attribute. We validate our method on three gene expression profiles. Compared with results from the literature and with other well-known classification methods, our method achieves better or comparable prediction performance while using lower-complexity models.
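The idea that a gene vector can carry more class information than any of its member genes can be illustrated with a minimal sketch. This is not the authors' algorithm: the record gives no details of the discretization, search strategy, or classifier, so the function names (`entropy`, `mutual_information`) and the toy data below are hypothetical. The sketch only assumes already-discretized expression values and the standard identity I(X;Y) = H(X) + H(Y) - H(X,Y).

```python
from collections import Counter
from math import log2


def entropy(values):
    """Empirical Shannon entropy (in bits) of a sequence of hashable values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(gene_vectors, classes):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete gene vectors X
    (one tuple per sample) and class labels Y."""
    xs = [tuple(v) for v in gene_vectors]
    joint = list(zip(xs, classes))
    return entropy(xs) + entropy(classes) - entropy(joint)


# Hypothetical toy data (not from the paper): four samples, two binarized
# genes, and a class label equal to the XOR of the two genes.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
classes = [0, 1, 1, 0]

# Each gene on its own, wrapped as a one-element vector.
mi_gene_a = mutual_information([(a,) for a, _ in samples], classes)
mi_gene_b = mutual_information([(b,) for _, b in samples], classes)

# The two genes taken together as a gene vector.
mi_pair = mutual_information(samples, classes)

print(mi_gene_a, mi_gene_b, mi_pair)
```

In this toy case neither gene alone is informative about the class, while the two-gene vector determines it completely. Ranking genes individually would miss such a pair, which is the motivation the abstract gives for scoring gene vectors rather than single genes.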
Keywords :
cancer; genetics; medical computing; statistical analysis; vectors; Occam razor; cancer classification; class distinction; correlation coefficient; feature selection; gene expression; gene vectors; lower-complexity models; mutual information; signal-to-noise statistic coefficient; Cancer; Entropy; Equations; Gene expression; Genetic communication; Information theory; Mutual information; Phase noise; Predictive models; Statistics;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Systems Bioinformatics Conference, 2005. Proceedings. 2005 IEEE
Print_ISBN :
0-7695-2344-7
Type :
conf
DOI :
10.1109/CSB.2005.35
Filename :
1498002