Title :
Analysis of breast cancer using data mining & statistical techniques
Author :
Xiong, Xiangchun ; Kim, Yangon ; Baek, Yuncheol ; Rhee, Dae Wong ; Kim, Soo-Hong
Author_Institution :
Comput. & Inf. Sci., Towson Univ., MD, USA
Abstract :
Data mining and statistical analysis is the search for valuable information in large volumes of data. It is now widely used in the health care industry. Breast cancer in particular is the second most common cancer and the second most dangerous. The best way to improve a breast cancer victim's chance of long-term survival is to detect the disease as early as possible. Currently there are three methods to diagnose breast cancer: mammography, FNA (fine needle aspirate) and surgical biopsy. The diagnostic accuracy of mammography ranges from 68% to 79%; the accuracy of FNA is inconsistent, varying from 65% to 98%; and the accuracy of a surgical biopsy is nearly 100%. The procedure of a surgical biopsy, however, is both unpleasant and costly. In this paper, we use FNA together with data mining and statistical methods to obtain an easy way of achieving the best result. We combine statistical methods such as PCA and PLS linear regression analysis with data mining methods such as attribute selection, decision trees and association rules to find unsuspected relationships. In addition, the experimental results are shown and discussed.
Keywords :
cancer; data mining; decision trees; health care; principal component analysis; regression analysis; PLS linear regression analysis; association rules; breast cancer detection; fine needle aspirate; health care industry; select attribute; statistics analysis; Biopsy; Breast cancer; Cancer detection; Information analysis; Mammography; Medical services; Needles; Oncological surgery; Statistical analysis
Conference_Title :
Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, 2005 and First ACIS International Workshop on Self-Assembling Wireless Networks. SNPD/SAWN 2005. Sixth International Conference on
Print_ISBN :
0-7695-2294-7
DOI :
10.1109/SNPD-SAWN.2005.19