Title :
Feature subset selection and parameters optimization for support vector machine in breast cancer diagnosis
Author :
Olfati, Elnaz ; Zarabadipour, Hassan ; Shoorehdeli, Mahdi Aliyari
Author_Institution :
Dept. of Electr. Eng., Imam Khomeini Int. Univ., Qazvin, Iran
Abstract :
Because of the high death rate among women with breast cancer, detection plays a major role in the treatment of this type of cancer, and early detection increases patients' chances of survival. The main tendency in feature extraction has been to represent the data in a different, lower-dimensional feature space, for instance by using principal component analysis (PCA). In this paper, we argue that selecting features solely according to the largest eigenvalues is not appropriate, because those components may not encode information that is useful for classification; instead, features should be selected from all the components by a feature selection method. A genetic algorithm (GA) is therefore used to select the most favorable principal components in place of the classical method. We apply PCA for dimension reduction, a GA for feature selection, and support vector machines (SVM) for classification. The algorithm is evaluated on the Wisconsin Breast Cancer Dataset (WBCD), which is commonly used by researchers who apply machine learning methods to breast cancer diagnosis. The performance of the approach is reported and compared with that of previously used methods. The approach yields a classifier that minimizes the number of features while maximizing accuracy, sensitivity, specificity, and receiver operating characteristic (ROC) performance. 10-fold cross-validation is used in the classification phase. The developed PCA+GA+SVM system obtains an average classification accuracy of 100% for a subset containing two features, which compares very favorably with previously reported results.
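The GA-based selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `ga_select`, its parameters, the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), and the `toy_fitness` function are all assumptions made for illustration. Each individual is a bit mask over the principal components. In a pipeline like the one in the abstract, the fitness would presumably be the cross-validated SVM accuracy on the selected components; here a hypothetical stand-in rewards two "useful" components that are deliberately not the leading ones and penalizes subset size.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Return the best bit mask (tuple of 0/1) found by a simple GA.

    `fitness` maps a mask to a number; higher is better.
    Requires n_features >= 2 (one-point crossover needs a cut position).
    """
    rng = random.Random(seed)
    # Random initial population; a 1 bit means "keep this component".
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament():
        # Pick two distinct individuals and return the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1
            # Bit-flip mutation, applied independently to each position.
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

# Hypothetical stand-in for cross-validated classifier accuracy:
# components 2 and 5 are assumed useful, every selected component costs 0.1.
USEFUL = {2, 5}

def toy_fitness(mask):
    hits = sum(mask[i] for i in USEFUL)
    return hits - 0.1 * sum(mask)

if __name__ == "__main__":
    best = ga_select(9, toy_fitness, seed=1)
    print("selected components:", [i for i, bit in enumerate(best) if bit])
```

Passing the fitness as a function keeps the search independent of the classifier, so the toy objective could be replaced by an accuracy estimate without changing the GA itself.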
Keywords :
cancer; feature extraction; genetic algorithms; learning (artificial intelligence); medical computing; patient treatment; principal component analysis; support vector machines; GA; PCA; ROC curves; Wisconsin breast cancer dataset; breast cancer detection; breast cancer diagnosis; cancer treatment; feature selection; feature subset selection; genetic algorithm; machine learning methods; parameters optimization; receiver operating characteristic; support vector machine; Accuracy; Breast cancer; Genetic algorithm (GA); Principal component analysis (PCA); Support vector machine (SVM)
Conference_Title :
Intelligent Systems (ICIS), 2014 Iranian Conference on
Conference_Location :
Bam
Print_ISBN :
978-1-4799-3350-1
DOI :
10.1109/IranianCIS.2014.6802601