DocumentCode :
3297075
Title :
Reduction of Variables for Predicting Breast Cancer Survivability Using Principal Component Analysis
Author :
Hussain, Sharaf ; Quazilbash, Naveen Zehra ; Bai, Samita ; Khoja, Shakeel
Author_Institution :
Fac. of Comput. Sci., Inst. of Bus. Adm., Karachi, Pakistan
fYear :
2015
fDate :
22-25 June 2015
Firstpage :
131
Lastpage :
134
Abstract :
This research uses breast cancer data from the Surveillance, Epidemiology, and End Results (SEER) dataset (1973-2010), which contains 684394 records. The data are cleaned using several data pre-processing techniques. Survivability is predicted using two different methods. In the first method, 14 variables are used, as suggested by Delen et al. [1]. In the second method, the 14 variables are reduced to 5 principal components using a statistical technique called Principal Component Analysis (PCA); together these components capture 98% of the total variance. Both methods yield almost the same level of accuracy, so the number of variables that must be taken into account in the analysis can be reduced.
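The variable-reduction step described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function name `pca_reduce`, the synthetic 14-column matrix, and the choice to center (but not rescale) the variables are all assumptions made here, and no SEER data is used.

```python
import numpy as np

def pca_reduce(X, variance_target=0.98):
    """Project X onto the fewest principal components whose cumulative
    explained variance reaches variance_target (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each variable
    # Rows of Vt are the principal directions; s holds the singular values.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # variance ratio per component
    cumulative = np.cumsum(explained)
    # First position where the cumulative ratio reaches the target.
    k = int(np.searchsorted(cumulative, variance_target)) + 1
    k = min(k, len(s))
    scores = Xc @ Vt[:k].T                       # reduced representation
    return scores, explained[:k]

# Synthetic stand-in for a 14-variable table (assumption: NOT SEER data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 5))
mixing = rng.normal(size=(5, 14))
X_demo = latent @ mixing + 0.01 * rng.normal(size=(1000, 14))

scores_demo, ratios_demo = pca_reduce(X_demo, variance_target=0.98)
print(scores_demo.shape[1], "components kept,",
      round(float(ratios_demo.sum()), 4), "of variance captured")
```

The reduced `scores_demo` matrix would then replace the original 14 columns as classifier input, which is the comparison the abstract reports between the two methods.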
Keywords :
cancer; data analysis; patient treatment; principal component analysis; tumours; SEER dataset; Surveillance Epidemiology and End Results dataset; breast cancer data; breast cancer survivability prediction; data preprocessing techniques; principal component analysis; variable reduction; Accuracy; Breast cancer; Data mining; Decision trees; Predictive models; Principal component analysis; Variable reduction; breast cancer; principal component analysis; seer dataset;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer-Based Medical Systems (CBMS), 2015 IEEE 28th International Symposium on
Conference_Location :
Sao Carlos
Type :
conf
DOI :
10.1109/CBMS.2015.62
Filename :
7167472