DocumentCode :
951754
Title :
Data-Dependent Kernel Machines for Microarray Data Classification
Author :
Xiong, Huilin ; Zhang, Ya ; Chen, Xue-wen
Author_Institution :
Univ. of Kansas, Lawrence
Volume :
4
Issue :
4
Year :
2007
Firstpage :
583
Lastpage :
595
Abstract :
One important application of gene expression analysis is to classify tissue samples according to their gene expression levels. Gene expression data are typically characterized by high dimensionality and small sample size, which make the classification task quite challenging. In this paper, we present a data-dependent kernel for microarray data classification. This kernel function is engineered so that the class separability of the training data is maximized. A bootstrapping-based resampling scheme is introduced to reduce the possible training bias. The effectiveness of this adaptive kernel for microarray data classification is illustrated with a k-Nearest Neighbor (KNN) classifier. Our experimental study shows that the data-dependent kernel leads to a significant improvement in the accuracy of KNN classifiers. Furthermore, this kernel-based KNN scheme is shown to be competitive with, if not better than, more sophisticated classifiers such as Support Vector Machines (SVMs) and Uncorrelated Linear Discriminant Analysis (ULDA) for classifying gene expression data.
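The abstract does not give the kernel's formula, so the following is only a minimal sketch of how a kernel-based KNN classifier of this kind could look. It assumes a conformal form k(x, y) = q(x) q(y) k0(x, y) with a Gaussian base kernel k0, which is one common way to build a data-dependent kernel and is not confirmed by this record. The names `make_data_dependent_kernel`, `kernel_knn_predict`, `anchors`, and `alphas` are hypothetical, and the coefficients are fixed by hand here rather than optimized for class separability or bootstrapped as the paper describes.

```python
import math
from collections import Counter


def rbf(x, y, gamma=0.5):
    """Base Gaussian kernel k0(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def make_data_dependent_kernel(anchors, alphas, alpha0=1.0, gamma=0.5):
    """Build k(x, y) = q(x) * q(y) * k0(x, y).

    Assumed form (not taken from the record):
    q(x) = alpha0 + sum_i alphas[i] * k0(x, anchors[i]).
    """
    def q(x):
        return alpha0 + sum(a * rbf(x, c, gamma) for a, c in zip(alphas, anchors))

    def kernel(x, y):
        return q(x) * q(y) * rbf(x, y, gamma)

    return kernel


def kernel_knn_predict(kernel, train_X, train_y, x, n_neighbors=3):
    """Majority vote among the n_neighbors closest training points,
    using the kernel-induced squared distance
    d^2(u, v) = k(u, u) - 2 k(u, v) + k(v, v)."""
    def dist2(u, v):
        return kernel(u, u) - 2.0 * kernel(u, v) + kernel(v, v)

    ranked = sorted(range(len(train_X)), key=lambda i: dist2(x, train_X[i]))
    votes = Counter(train_y[i] for i in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Toy two-class data standing in for expression profiles (illustrative only).
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["A", "A", "A", "B", "B", "B"]

# Hand-picked anchors and coefficients; the paper would instead tune these.
kernel = make_data_dependent_kernel(anchors=[train_X[0], train_X[3]],
                                    alphas=[0.5, 0.5])

pred_a = kernel_knn_predict(kernel, train_X, train_y, (0.5, 0.5))
pred_b = kernel_knn_predict(kernel, train_X, train_y, (5.5, 5.5))
```

Because the classifier only ever touches the data through `kernel`, swapping in a different scaling function q changes the neighborhood structure without changing the KNN code, which is the general idea behind pairing an adaptive kernel with a simple classifier.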
Keywords :
cancer; data analysis; genetics; learning (artificial intelligence); medical computing; pattern classification; sampling methods; support vector machines; bootstrap-based resampling scheme; data-dependent kernel machine; gene expression analysis; k-Nearest Neighbor classifier; microarray data classification; tissue sample classification; uncorrelated linear discriminant analysis; Microarray data analysis; bootstrapping resampling; cancer classification; kernel machines; kernel optimization; Algorithms; Cell Line, Tumor; Computational Biology; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Humans; Male; Models, Statistical; Neoplasms; Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated; Reproducibility of Results; Software
Language :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/tcbb.2007.1048
Filename :
4359843