• DocumentCode
    921674
  • Title
    Improve Computer-Aided Diagnosis With Machine Learning Techniques Using Undiagnosed Samples
  • Author
    Li, Ming; Zhou, Zhi-Hua
  • Author_Institution
    Nanjing Univ., Nanjing
  • Volume
    37
  • Issue
    6
  • fYear
    2007
  • Firstpage
    1088
  • Lastpage
    1098
  • Abstract
    In computer-aided diagnosis (CAD), machine learning techniques have been widely applied to learn a hypothesis from diagnosed samples to assist medical experts in making a diagnosis. To learn a well-performing hypothesis, a large number of diagnosed samples is required. Although samples can be easily collected from routine medical examinations, it is usually impossible for medical experts to make a diagnosis for each of the collected samples. If a hypothesis could be learned in the presence of a large number of undiagnosed samples, the heavy burden on the medical experts could be relieved. In this paper, a new semisupervised learning algorithm named Co-Forest is proposed. It extends the co-training paradigm by using a well-known ensemble method named Random Forest, which enables Co-Forest to estimate the labeling confidence of undiagnosed samples and easily produce the final hypothesis. Experiments on benchmark data sets verify the effectiveness of the proposed algorithm. Case studies on three medical data sets and a successful application to microcalcification detection for breast cancer diagnosis show that undiagnosed samples are helpful in building CAD systems, and that Co-Forest is able to enhance the performance of a hypothesis learned on only a small number of diagnosed samples by utilizing the available undiagnosed samples.
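    The idea summarized in the abstract (an ensemble whose members label undiagnosed samples for one another when the vote is confident enough) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `train_stump`, `vote`, and `co_forest_sketch`, the threshold `theta`, and the fixed number of rounds are all assumptions made for illustration, binary 0/1 labels are assumed, one-feature decision stumps stand in for the random trees of a Random Forest, and any further conditions the paper places on accepting newly labeled samples are omitted.

```python
import random
from collections import Counter


def train_stump(data, rng):
    """Fit a threshold rule on one randomly chosen feature (stand-in for a random tree)."""
    f = rng.randrange(len(data[0][0]))  # random feature gives ensemble diversity
    best = None
    for x, _ in data:
        t = x[f]  # candidate thresholds are the observed feature values
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if xi[f] <= t else right) != yi for xi, yi in data)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x[f] <= t else right


def vote(ensemble, x):
    """Return (majority label, fraction of members that agree with it)."""
    counts = Counter(h(x) for h in ensemble)
    label, n = counts.most_common(1)[0]
    return label, n / len(ensemble)


def co_forest_sketch(labeled, unlabeled, n_members=6, theta=0.75, rounds=3, seed=0):
    """Hypothetical, simplified co-training loop over an ensemble.

    labeled:   list of (feature_tuple, label) pairs with labels 0 or 1
    unlabeled: list of feature tuples
    """
    rng = random.Random(seed)
    ensemble = [train_stump(labeled, rng) for _ in range(n_members)]
    for _ in range(rounds):
        refined = []
        for i in range(n_members):
            # All members except the i-th act as its "teachers".
            others = ensemble[:i] + ensemble[i + 1:]
            pseudo = []
            for x in unlabeled:
                label, confidence = vote(others, x)
                if confidence >= theta:  # keep only confidently labeled samples
                    pseudo.append((x, label))
            # Retrain member i on the diagnosed samples plus its pseudo-labeled ones.
            refined.append(train_stump(labeled + pseudo, rng))
        ensemble = refined
    return ensemble
```

    The final hypothesis in this sketch is simply the majority vote of the refined ensemble, for example `vote(ensemble, x)[0]` for a new sample `x`; the pseudo-labeled set is rebuilt in every round rather than accumulated.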
  • Keywords
    cancer; gynaecology; learning (artificial intelligence); medical diagnostic computing; medical expert systems; pattern clustering; tumours; breast cancer diagnosis; co-forest semi supervised learning algorithm; computer-aided diagnosis; machine learning technique; medical data sets; medical expert system; microcalcification cluster detection; random forest ensemble method; routine medical examination; undiagnosed samples; Breast cancer; Cancer detection; Computer aided diagnosis; Labeling; Machine learning; Machine learning algorithms; Medical diagnostic imaging; Semisupervised learning; Supervised learning; Technological innovation; Computer-aided diagnosis (CAD); co-training; ensemble learning; machine learning; microcalcification cluster detection; random forest; semisupervised learning;
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans
  • Publisher
    IEEE
  • ISSN
    1083-4427
  • Type
    jour
  • DOI
    10.1109/TSMCA.2007.904745
  • Filename
    4342802