Abstract :
In microarray-based cancer classification, feature selection and the choice of classification method are important issues because the number of variables (gene expressions) is large while the number of experimental conditions is small. In disease diagnosis, classifier performance has a direct impact on the final results. In this paper, we propose a new method of gene selection and classification that uses a nonlinear kernel support vector machine (SVM) based on recursive performance elimination (RFE). Experiments demonstrate that our method has better overall performance than linear classification methods, such as the linear kernel support vector machine and Fisher linear discriminant analysis (FLDA), and that it also outperforms some nonlinear classification methods, such as the least squares support vector machine (LS-SVM) with a nonlinear kernel. In the experiments, in addition to a test set, the leave-one-out algorithm is used to assess the classifiers' generalization performance. The AML/ALL dataset and a hereditary breast cancer dataset are used, both of which are available on the Internet.
Keywords :
biological organs; cancer; feature extraction; genetics; gynaecology; learning (artificial intelligence); medical diagnostic computing; molecular biophysics; pattern classification; statistical analysis; support vector machines; tumours; FLDA; LS-SVM; RFE; SVM; disease diagnosis; feature classification; feature selection; fisher linear discriminant analysis; gene classification; gene expression; gene selection; hereditary breast cancer dataset; least square support vector machine; leave-one-out algorithm; linear kernel support vector machine; microarray-based cancer classification; nonlinear kernel support vector machine; recursive performance elimination; Breast cancer; Diseases; Gene expression; Internet; Kernel; Least squares methods; Linear discriminant analysis; Support vector machine classification; Support vector machines; Testing; Data classification; Gene selection; Support vector machine
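
The leave-one-out evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the implementation, so everything here is an assumption made for illustration: the names `leave_one_out_accuracy` and `train_nearest_centroid` are hypothetical, the toy data is invented, and a nearest-centroid rule stands in for the paper's nonlinear kernel SVM so that the example needs only the Python standard library.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Predictor = Callable[[Vector], int]
# A trainer takes training samples and labels and returns a predict function.
Trainer = Callable[[List[Vector], List[int]], Predictor]


def train_nearest_centroid(X: List[Vector], y: List[int]) -> Predictor:
    """Stand-in classifier (NOT the paper's SVM): predict the class
    whose mean expression profile is closest in squared Euclidean distance."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Column-wise mean over all training samples of this class.
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x: Vector) -> int:
        def sq_dist(c: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return min(centroids, key=lambda lab: sq_dist(centroids[lab]))

    return predict


def leave_one_out_accuracy(X: List[Vector], y: List[int], trainer: Trainer) -> float:
    """Hold out each sample once, train on the rest, and test on the held-out sample."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        predict = trainer(X_train, y_train)
        correct += int(predict(X[i]) == y[i])
    return correct / len(X)


# Invented toy data: 6 samples, 2 "genes", two well-separated classes.
toy_X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.9, 1.1], [1.2, 1.0]]
toy_y = [0, 0, 0, 1, 1, 1]

toy_accuracy = leave_one_out_accuracy(toy_X, toy_y, train_nearest_centroid)
print(f"leave-one-out accuracy: {toy_accuracy:.2f}")
```

Because the trainer is passed in as a function, the same loop could wrap any classifier, including an SVM with a nonlinear kernel. With the small sample counts typical of microarray studies, retraining once per sample remains affordable.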