DocumentCode :
2502841
Title :
Decision Tree Based Predictive Models for Breast Cancer Survivability on Imbalanced Data
Author :
Liu Ya-Qin ; Wang Cheng ; Zhang Lu
Author_Institution :
Dept. of Biomed. Eng., Shanghai JiaoTong Univ., Shanghai, China
fYear :
2009
fDate :
11-13 June 2009
Firstpage :
1
Lastpage :
4
Abstract :
Predictive models for the 5-year survivability of breast cancer are built with decision trees on imbalanced data. After preprocessing of the SEER breast cancer datasets, the class distribution is clearly imbalanced. Under-sampling is used to offset the loss in model performance caused by the imbalanced data. Model performance is evaluated by the area under the ROC curve (AUC), accuracy, specificity, and sensitivity, using 10-fold stratified cross-validation. The models perform best when the class distribution is approximately equal. The bagging algorithm is used to build an ensemble decision tree model for predicting breast cancer survivability.
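The under-sampling and evaluation steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `undersample` and `confusion_metrics`, the `ratio` parameter, and the toy labels are all assumptions made for illustration, and the abstract does not say which under-sampling variant was used (plain random under-sampling is assumed here).

```python
import random


def undersample(rows, labels, majority_label, ratio=1.0, seed=0):
    """Randomly drop majority-class rows (hypothetical helper).

    Keeps all minority rows and at most ratio * (minority count)
    majority rows, so ratio=1.0 gives a roughly equal class split.
    """
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    keep = min(len(majority), int(round(ratio * len(minority))))
    chosen = sorted(rng.sample(majority, keep) + minority)
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


def _safe_div(num, den):
    # Avoid ZeroDivisionError when a class is absent from y_true.
    return num / den if den else float("nan")


def confusion_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": _safe_div(tp + tn, len(pairs)),
        "sensitivity": _safe_div(tp, tp + fn),  # true positive rate
        "specificity": _safe_div(tn, tn + fp),  # true negative rate
    }


# Toy data (invented): 90 records of class 1, 10 records of class 0.
rows = list(range(100))
labels = [1] * 90 + [0] * 10
balanced_rows, balanced_labels = undersample(rows, labels, majority_label=1)
metrics = confusion_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

With `ratio=1.0` the sketch keeps every minority record and an equal number of randomly chosen majority records, which corresponds to the "approximately equal" distribution the abstract reports as performing best. Stratified cross-validation, AUC, and bagging are left out to keep the sketch dependency-free.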
Keywords :
biological organs; cancer; data mining; decision trees; gynaecology; medical computing; prediction theory; sampling methods; tumours; AUC; ROC curve; bagging algorithm; breast cancer survivability; data distribution; data mining; data preprocessing; decision tree; imbalanced data analysis; predictive model; under-sampling method; Bagging; Biomedical engineering; Breast cancer; Cleaning; Data mining; Data preprocessing; Decision trees; Dictionaries; Predictive models; Sensitivity;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedical Engineering, 2009. ICBBE 2009. 3rd International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-2901-1
Electronic_ISBN :
978-1-4244-2902-8
Type :
conf
DOI :
10.1109/ICBBE.2009.5162571
Filename :
5162571