DocumentCode :
2461947
Title :
A comparison of non-symmetric entropy-based classification trees and support vector machine for cardiovascular risk stratification
Author :
Singh, Anima ; Guttag, John V.
Author_Institution :
Department of Electrical Engineering and Computer Science at Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
fYear :
2011
fDate :
Aug. 30-Sept. 3, 2011
Firstpage :
79
Lastpage :
82
Abstract :
Classification tree-based risk stratification models generate easily interpretable classification rules. This feature makes classification tree-based models appealing for use in a clinical setting, provided that they have comparable accuracy to other methods. In this paper, we present and evaluate the performance of a non-symmetric entropy-based classification tree algorithm. The algorithm is designed to accommodate class imbalance found in many medical datasets. We evaluate the performance of this algorithm, and compare it to that of SVM-based classifiers, when applied to 4219 non-ST elevation acute coronary syndrome patients. We generated SVM-based classifiers using three different strategies for handling class imbalance: cost-sensitive SVM learning, synthetic minority oversampling (SMOTE), and random majority undersampling. We used both linear and radial basis kernel-based SVMs. Our classification tree models outperformed SVM-based classifiers generated using each of the three techniques. On average, the classification tree models yielded a 14% improvement in G-score and a 21% improvement in F-score relative to the linear SVM classifiers with the best performance. Similarly, our classification tree models yielded a 12% improvement in G-score and a 21% improvement in the F-score over the best RBF kernel-based SVM classifiers.
Keywords :
Algorithm design and analysis; Entropy; History; Kernel; Machine learning; Support vector machines; Training; Acute Coronary Syndrome; Diagnosis, Computer-Assisted; Entropy; Humans; Pattern Recognition, Automated; Prevalence; Proportional Hazards Models; Reproducibility of Results; Risk Assessment; Risk Factors; Sensitivity and Specificity; Support Vector Machines; Survival Analysis; Survival Rate
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Engineering in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the IEEE
Conference_Location :
Boston, MA
ISSN :
1557-170X
Print_ISBN :
978-1-4244-4121-1
Electronic_ISBN :
1557-170X
Type :
conf
DOI :
10.1109/IEMBS.2011.6089901
Filename :
6089901