DocumentCode :
1757988
Title :
Mass Classification in Mammograms Using Selected Geometry and Texture Features, and a New SVM-Based Feature Selection Method
Author :
Xiaoming Liu ; Jinshan Tang
Author_Institution :
Coll. of Comput. Sci. & Technol., Wuhan Univ. of Sci. & Technol., Wuhan, China
Volume :
8
Issue :
3
fYear :
2014
fDate :
Sept. 2014
Firstpage :
910
Lastpage :
920
Abstract :
Masses are the primary indications of breast cancer in mammograms, and it is important to classify them as benign or malignant. Benign and malignant masses differ in geometry and texture characteristics. However, not every geometry and texture feature that is extracted contributes to the improvement of classification accuracy; thus, to select the best features from a set is important. In this paper, we examine the feature selection methods for mass classification. We integrate a support vector machine (SVM)-based recursive feature elimination (SVM-RFE) procedure with a normalized mutual information feature selection (NMIFS) to avoid their singular disadvantages (the redundancy in the selected features of the SVM-RFE and the unoptimized classifier for the NMIFS) while retaining their advantages, and we propose a new feature selection method, which is called the SVM-RFE with an NMIFS filter (SRN). In addition to feature selection, we also study the initialization of mass segmentation. Different initialization methods are investigated, and we propose a fuzzy c-means (FCM) clustering, with spatial constraints as the initialization step. In the experiments, 826 regions of interest (ROIs) from the Digital Database for Screening Mammography were used. All 826 were used in the classification experiments, and 413 ROIs were used in the feature selection experiments. Different feature selection methods, including F-score, Relief, SVM-RFE, SVM-RFE with a minimum redundancy-maximum relevance (mRMR) filter [SVM-RFE (mRMR)], and SRN, were used to select features and to compare mass classification results using the selected features. In the classification experiments, the linear discriminant analysis and the SVM classifiers were investigated. 
The accuracies obtained with the SVM classifier using the features selected by the F-score, Relief, SVM-RFE, SVM-RFE (mRMR), and SRN methods are 88%, 88%, 90%, 91%, and 93%, respectively, with a tenfold cross-validation procedure, and 91%, 89%, 92%, 92%, and 94%, respectively, with a leave-one-out (LOO) scheme. We also compared the performance of the different feature selection methods using the receiver operating characteristic analysis and the areas under the curve (AUCs). The AUCs for the F-score, Relief, SVM-RFE, SVM-RFE (mRMR), and SRN methods are 0.9014, 0.8916, 0.9121, 0.9236, and 0.9439, respectively, with a tenfold cross-validation procedure, and 0.9312, 0.9178, 0.9324, 0.9413, and 0.9615, respectively, with a LOO scheme. Both the accuracy and AUC values show that the proposed SRN feature selection method has the best performance. In addition to the accuracy and the AUC, we also tested the significance of the difference between the two best feature selection methods, i.e., the SVM-RFE (mRMR) and the proposed SRN method. Experimental results show that the proposed SRN method is significantly more accurate than the SVM-RFE (mRMR) (p = 0.011).
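The redundancy problem that the abstract attributes to SVM-RFE can be illustrated with a small sketch of an NMIFS-style filter. This is not the authors' SRN implementation: it covers only the mutual-information filter, omits the SVM-RFE ranking entirely, and assumes already-discretized features. The scoring rule (relevance to the class minus the mean normalized mutual information with the features already chosen, normalized by the smaller of the two entropies) follows the commonly cited NMIFS criterion; the function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys):
    """Mutual information divided by min(H(X), H(Y)); 0 if either is constant."""
    denom = min(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / denom if denom > 0 else 0.0


def nmifs_select(features, labels, k):
    """Greedily pick up to k feature names (hypothetical helper).

    features: dict mapping feature name -> list of discrete values.
    Each step takes the feature with the highest relevance to the labels
    minus its mean normalized MI with the features selected so far.
    """
    remaining = dict(features)
    selected = []
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(remaining[name], labels)
            if not selected:
                return relevance
            redundancy = sum(
                normalized_mi(remaining[name], features[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Toy data (made up): "a_copy" duplicates "a"; "b" is independent of "a".
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a":      [0, 0, 0, 1, 1, 1, 1, 0],
    "a_copy": [0, 0, 0, 1, 1, 1, 1, 0],
    "b":      [1, 0, 0, 0, 1, 1, 0, 1],
}
chosen = nmifs_select(features, labels, 2)
print(sorted(chosen))
```

On this toy data the exact duplicate is penalized by its full redundancy with the feature it copies, so the filter prefers the independent feature as the second choice. A ranking based only on per-feature relevance would have no reason to separate the two.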
Keywords :
cancer; feature extraction; feature selection; image classification; mammography; medical image processing; pattern clustering; support vector machines; F-score; FCM clustering; NMIFS filter; ROI; Relief; SVM classifiers; SVM-RFE procedure; SVM-based feature selection method; SVM-based recursive feature elimination; areas under the curve; benign masses; breast cancer; digital database; feature selection experiments; feature selection methods; fuzzy c-means; geometry features; initialization methods; leave-one-out scheme; linear discriminant analysis; mRMR filter; malignant masses; mammograms; mass classification results; mass segmentation; minimum redundancy-maximum relevance; normalized mutual information feature selection; receiver operating characteristic analysis; regions of interest; screening mammography; singular disadvantages; support vector machine; tenfold cross-validation procedure; texture features; Breast cancer; Feature extraction; Geometry; Image segmentation; Redundancy; Support vector machines; Breast cancer; feature selection; mammogram; mass classification; mutual information (MI); recursive feature elimination (RFE); support vector machine (SVM);
fLanguage :
English
Journal_Title :
Systems Journal, IEEE
Publisher :
IEEE
ISSN :
1932-8184
Type :
jour
DOI :
10.1109/JSYST.2013.2286539
Filename :
6663666