DocumentCode :
58599
Title :
Gene-Expression-Based Cancer Subtypes Prediction Through Feature Selection and Transductive SVM
Author :
Maulik, Ujjwal ; Mukhopadhyay, Amit ; Chakraborty, Debasis
Author_Institution :
Dept. of Comput. Sci. & Eng., Jadavpur Univ., Kolkata, India
Volume :
60
Issue :
4
fYear :
2013
fDate :
Apr. 2013
Firstpage :
1111
Lastpage :
1117
Abstract :
With the advancement of microarray technology, gene expression profiling has shown great potential in outcome prediction for different types of cancers. Microarray cancer data, organized in a samples-versus-genes fashion, are being exploited for the classification of tissue samples into benign and malignant or their subtypes. They are also useful for identifying potential gene markers for each cancer subtype, which helps in the successful diagnosis of a particular cancer type. Nevertheless, small sample size remains a bottleneck in designing suitable classifiers. Traditional supervised classifiers can only work with labeled data. On the other hand, a large amount of microarray data that does not have adequate follow-up information is disregarded. A novel approach that combines feature (gene) selection and the transductive support vector machine (TSVM) is proposed. We demonstrated that 1) potential gene markers could be identified and 2) TSVMs improved prediction accuracy as compared to the standard inductive SVMs (ISVMs). A forward greedy search algorithm based on consistency and a statistic called signal-to-noise ratio were employed to obtain the potential gene markers. The selected genes of the microarray data were then exploited to design the TSVM. Experimental results confirm the effectiveness of the proposed technique compared to the ISVM and the low-density separation method in the area of semisupervised cancer classification as well as gene-marker identification.
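The signal-to-noise ratio statistic mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the formula, so the code assumes the definition commonly used for two-class microarray data: the difference of the class means divided by the sum of the class standard deviations, computed per gene. The function names `snr_scores` and `top_genes` are hypothetical, and the sketch covers neither the consistency-based forward greedy search nor the TSVM itself.

```python
import numpy as np

def snr_scores(X, y):
    """Per-gene signal-to-noise ratio for a two-class problem.

    X: array of shape (n_samples, n_genes); y: labels in {0, 1}.
    Assumed definition: (mean_1 - mean_0) / (std_1 + std_0).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    mean_diff = pos.mean(axis=0) - neg.mean(axis=0)
    std_sum = pos.std(axis=0, ddof=1) + neg.std(axis=0, ddof=1)
    # Small constant guards against division by zero for constant genes.
    return mean_diff / (std_sum + 1e-12)

def top_genes(X, y, k):
    """Indices of the k genes with the largest absolute SNR, best first."""
    scores = np.abs(snr_scores(X, y))
    return np.argsort(scores)[::-1][:k]

# Toy data: 6 samples, 3 genes; only the second gene separates the classes.
X = [[1, 10, 5], [2, 11, 4], [1, 12, 6],
     [2, 20, 5], [1, 21, 4], [2, 22, 6]]
y = [0, 0, 0, 1, 1, 1]
print(top_genes(X, y, 2))
```

Ranking by absolute value treats genes that are over-expressed and under-expressed in one class symmetrically; a signed ranking would be an equally reasonable choice depending on the application.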
Keywords :
cancer; genetics; medical computing; patient diagnosis; statistical analysis; support vector machines; tumours; benign cancer; feature selection; forward greedy search algorithm; gene expression profiling; gene-expression-based cancer subtype prediction; gene-marker identification; low-density separation method; malignant cancer; microarray cancer data; microarray technology; patient diagnosis; prediction accuracy; semisupervised cancer classification; signal-noise ratio; standard inductive SVM; statistical analysis; tissue samples; traditional supervised classifiers; transductive SVM; transductive support vector machine; Accuracy; Cancer; Gene expression; Signal to noise ratio; Support vector machines; Training; Tumors; Low-density separation (LDS); microarray data; semisupervised classification; support vector machines (SVM); transductive learning; Algorithms; Computational Biology; Databases, Genetic; Gene Expression Profiling; Genetic Markers; Humans; Models, Biological; Neoplasms; Signal-To-Noise Ratio; Support Vector Machines; Tumor Markers, Biological;
fLanguage :
English
Journal_Title :
Biomedical Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9294
Type :
jour
DOI :
10.1109/TBME.2012.2225622
Filename :
6334430