DocumentCode :
1080367
Title :
Significance of Gene Ranking for Classification of Microarray Samples
Author :
Chaolin Zhang ; Xuesong Lu ; Xuegong Zhang
Author_Institution :
Dept. of Biomed. Eng., State Univ. of New York, Stony Brook, NY
Volume :
3
Issue :
3
Year :
2006
Firstpage :
312
Lastpage :
320
Abstract :
Many methods for classification and gene selection with microarray data have been developed. These methods usually give a ranking of genes. Evaluating the statistical significance of the gene ranking is important for understanding the results and for further biological investigations, but this question has not been well addressed for machine learning methods in existing works. Here, we address this problem by formulating it in the framework of hypothesis testing and propose a solution based on resampling. The proposed r-test methods convert gene ranking results into position p-values to evaluate the significance of genes. The methods are tested on three real microarray data sets and three simulation data sets with support vector machines as the method of classification and gene selection. The obtained position p-values help to determine the number of genes to be selected and enable scientists to analyze selection results by sophisticated multivariate methods under the same statistical inference paradigm as for simple hypothesis testing methods.
Keywords :
biology computing; cellular biophysics; genetics; learning (artificial intelligence); molecular biophysics; statistical analysis; support vector machines; gene ranking; gene selection; hypothesis testing; machine learning methods; microarray sample classification; position p-values; r-test methods; resampling; sophisticated multivariate methods; statistical inference paradigm; support vector machines; Cancer; Chaos; Data analysis; Filtering; Learning systems; Statistical analysis; Statistical distributions; Support vector machine classification; Support vector machines; Testing; Significance of gene ranking; classification; gene selection; microarray data analysis; Algorithms; Artificial Intelligence; Cluster Analysis; Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; Sample Size
Language :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2006.42
Filename :
1668029