DocumentCode
921674
Title
Improve Computer-Aided Diagnosis With Machine Learning Techniques Using Undiagnosed Samples
Author
Li, Ming ; Zhou, Zhi-Hua
Author_Institution
Nanjing Univ., Nanjing
Volume
37
Issue
6
fYear
2007
Firstpage
1088
Lastpage
1098
Abstract
In computer-aided diagnosis (CAD), machine learning techniques have been widely applied to learn a hypothesis from diagnosed samples to assist medical experts in making a diagnosis. Learning a hypothesis that performs well requires a large number of diagnosed samples. Although samples can be easily collected from routine medical examinations, it is usually impossible for medical experts to diagnose every collected sample. If a hypothesis could be learned in the presence of a large number of undiagnosed samples, the heavy burden on medical experts could be relieved. In this paper, a new semisupervised learning algorithm named Co-Forest is proposed. It extends the co-training paradigm by using a well-known ensemble method named Random Forest, which enables Co-Forest to estimate the labeling confidence of undiagnosed samples and easily produce the final hypothesis. Experiments on benchmark data sets verify the effectiveness of the proposed algorithm. Case studies on three medical data sets and a successful application to microcalcification detection for breast cancer diagnosis show that undiagnosed samples are helpful in building CAD systems, and that Co-Forest can enhance the performance of a hypothesis learned from only a small number of diagnosed samples by utilizing the available undiagnosed samples.
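The abstract states only that the ensemble lets Co-Forest estimate the labeling confidence of undiagnosed samples; it does not give the procedure. One plausible reading, sketched below as an illustration rather than the authors' implementation, is that each ensemble member receives pseudo-labels from the majority vote of the other members, and a sample is accepted only when their agreement reaches a threshold. The function names, the threshold value of 0.75, and the vote data are all hypothetical.

```python
from collections import Counter


def concomitant_vote(predictions, i):
    """Majority label and agreement fraction among all members except member i.

    `predictions` holds one predicted label per ensemble member for a
    single sample. (Hypothetical helper, not taken from the paper.)
    """
    others = [p for j, p in enumerate(predictions) if j != i]
    label, count = Counter(others).most_common(1)[0]
    return label, count / len(others)


def select_confident(unlabeled_predictions, i, threshold=0.75):
    """Return (sample_index, pseudo_label) pairs that member i could train on.

    A sample is kept only if the other members agree on its label with a
    fraction of at least `threshold` (an assumed value for illustration).
    """
    selected = []
    for idx, predictions in enumerate(unlabeled_predictions):
        label, confidence = concomitant_vote(predictions, i)
        if confidence >= threshold:
            selected.append((idx, label))
    return selected


# Made-up predictions of a 5-member ensemble on 3 undiagnosed samples.
votes = [
    ["benign", "benign", "benign", "malignant", "benign"],
    ["benign", "malignant", "benign", "benign", "malignant"],
    ["benign", "malignant", "malignant", "malignant", "malignant"],
]

# Pseudo-labeled samples offered to member 0 by the other four members.
picked = select_confident(votes, i=0, threshold=0.75)
print(picked)
```

Here the second sample is rejected because the other four members split evenly, while the first and third are accepted with their majority labels. In a full semisupervised loop, each member would be retrained on its accepted samples and the process repeated; that loop is omitted from this sketch.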
Keywords
cancer; gynaecology; learning (artificial intelligence); medical diagnostic computing; medical expert systems; pattern clustering; tumours; breast cancer diagnosis; co-forest semi supervised learning algorithm; computer-aided diagnosis; machine learning technique; medical data sets; medical expert system; microcalcification cluster detection; random forest ensemble method; routine medical examination; undiagnosed samples; Breast cancer; Cancer detection; Computer aided diagnosis; Labeling; Machine learning; Machine learning algorithms; Medical diagnostic imaging; Semisupervised learning; Supervised learning; Technological innovation; Computer-aided diagnosis (CAD); co-training; ensemble learning; machine learning; microcalcification cluster detection; random forest; semisupervised learning;
fLanguage
English
Journal_Title
Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE Transactions on
Publisher
IEEE
ISSN
1083-4427
Type
jour
DOI
10.1109/TSMCA.2007.904745
Filename
4342802