Title of article :
A Comparative Study of Multilayer Neural Network and C4.5 Decision Tree Models for Predicting the Risk of Breast Cancer
Author/Authors :
Sohrabi, Soolmaz Shahid Beheshti University of Medical Sciences - Department of Medical Informatics, Tehran ; Atashi, Alireza Department of E-Health - Virtual School - Tehran University of Medical Sciences, Tehran ; Dadashi, Ali Mashhad University of Medical Sciences - Department of Medical Informatics, Mashhad ; Marashi, Sina Department of E-Health - Virtual School - Tehran University of Medical Sciences, Tehran
Abstract :
Background: Diagnosing breast cancer at an early stage can have a great impact
on cancer mortality. One of the fundamental problems in cancer treatment is the lack
of a proper method for early detection, which may lead to diagnostic errors. Using
data analysis techniques can significantly help in early diagnosis of the disease. The
purpose of this study was to evaluate and compare the efficacy of two data mining
techniques, i.e., multilayer neural network and C4.5, in early diagnosis of breast
cancer.
Methods: A data set from Motamed Cancer Institute's breast cancer research
clinic, Tehran, containing 2860 records related to breast cancer risk factors was
used. Of the records, 1141 (40%) were related to malignant changes and breast
cancer and 1719 (60%) to benign tumors. The data set was split into a training
data set (70%) and a testing data set (30%) and analyzed with perceptron neural
network and decision tree algorithms using Rapid Miner 5.2.
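As a rough illustration of the 70%/30% split described in the Methods, the following Python sketch partitions record indices while keeping the malignant/benign proportions the same in both parts. The abstract does not say whether the split was stratified or how Rapid Miner was configured, so the stratification, the function name `stratified_split`, and the fixed seed are assumptions made for this example only; the class counts are the ones reported above.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), preserving class proportions.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)                       # random order within each class
        cut = round(len(idx) * train_fraction) # 70% of this class goes to training
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


# Class counts from the abstract; the label values are placeholders.
labels = ["malignant"] * 1141 + ["benign"] * 1719
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Under these assumptions the training part holds about 70% of each class, so a model trained on it sees roughly the same 40%/60% class balance as the full data set.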
Results: For neural networks, accuracy was 80.52%, precision 88.91%, and
sensitivity 90.88%; and for decision tree, accuracy was 80.98%, precision 80.97%,
and sensitivity 89.32%. Results indicated that both algorithms have acceptable
capabilities for analyzing breast cancer data.
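For readers less familiar with the three measures reported in the Results, the sketch below shows how accuracy, precision, and sensitivity are conventionally computed from a binary confusion matrix, with "malignant" treated as the positive class. The abstract does not report confusion matrices, so the counts used here are hypothetical and chosen only to make the arithmetic easy to follow; the function name is likewise an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard definitions from confusion-matrix counts.

    tp/fp: true/false positives, tn/fn: true/false negatives.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all cases classified correctly
    precision = tp / (tp + fp)                  # share of predicted positives that are truly positive
    sensitivity = tp / (tp + fn)                # share of actual positives that were detected
    return accuracy, precision, sensitivity


# Hypothetical counts, NOT taken from the study.
acc, prec, sens = classification_metrics(tp=8, fp=2, tn=7, fn=3)
print(f"accuracy={acc:.2%} precision={prec:.2%} sensitivity={sens:.2%}")
```

Because sensitivity depends only on the positive cases, two models with nearly equal accuracy can still differ in how many malignant cases they miss, which is the comparison the Conclusion draws.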
Conclusion: Although both models provided good results, the neural network
diagnosed positive cases more reliably. The type of data set and the analysis
method affect the results. In addition, information about stronger risk
factors for breast cancer, such as genetic mutations, could yield models with
higher coverage.
Keywords :
Decision tree, multilayer neural network, breast cancer, data analysis
Journal title :
Astroparticle Physics