Title of article :
Multiclass cancer classification by support vector machines with class-wise optimized genes and probability estimates
Author/Authors :
Anand، نويسنده , , Ashish and Suganthan، نويسنده , , P.N.، نويسنده ,
Issue Information :
روزنامه با شماره پیاپی سال 2009
Pages :
8
From page :
533
To page :
540
Abstract :
We investigate the multiclass classification of cancer microarray samples. In contrast to classification of two cancer types from gene expression data, multiclass classification of more than two cancer types are relatively hard and less studied problem. We used class-wise optimized genes with corresponding one-versus-all support vector machine (OVA-SVM) classifier to maximize the utilization of selected genes. Final prediction was made by using probability scores from all classifiers. We used three different methods of estimating probability from decision value. Among the three probability methods, Plattʹs approach was more consistent, whereas, isotonic approach performed better for datasets with unequal proportion of samples in different classes. Probability based decision does not only gives true and fair comparison between different one-versus-all (OVA) classifiers but also gives the possibility of using them for any post analysis. Several ensemble experiments, an example of post analysis, of the three probability methods were implemented to study their effect in improving the classification accuracy. We observe that ensemble did help in improving the predictive accuracy of cancer data sets especially involving unbalanced samples. Four-fold external stratified cross-validation experiment was performed on the six multiclass cancer datasets to obtain unbiased estimates of prediction accuracies. Analysis of class-wise frequently selected genes on two cancer datasets demonstrated that the approach was able to select important and relevant genes consistent to literature. This study demonstrates successful implementation of the framework of class-wise feature selection and multiclass classification for prediction of cancer subtypes on six datasets.
Keywords :
Class-wise selected genes , Multiclass SVM , Cancer subtype classification , Probability outputs SVM , Ensemble SVM
Journal title :
Journal of Theoretical Biology
Serial Year :
2009
Journal title :
Journal of Theoretical Biology
Record number :
1539781
Link To Document :
بازگشت