• DocumentCode
    2497985
  • Title
    Noise reduction to text categorization based on density for KNN
  • Author
    Li, Rong-lu; Hu, Yun-fa
  • Author_Institution
    Comput. Technol. & Inf. Dept., Fudan Univ., Shanghai, China
  • Volume
    5
  • fYear
    2003
  • fDate
    2-5 Nov. 2003
  • Firstpage
    3119
  • Abstract
    With the rapid development of the World Wide Web, text classification has become a key technology for organizing and processing large amounts of document data. As a simple and effective classification approach, the KNN method is widely used in text categorization. However, the KNN classifier not only has large computational demands, but its classification precision may also decrease because of the uneven density of the training data. In this paper, we present a density-based method for reducing noise in the training data, which solves these problems. Our experimental results also illustrate this.
  • Keywords
    classification; information retrieval; learning (artificial intelligence); text analysis; KNN classifier; KNN method; World Wide Web; density based method; document data; k-nearest neighbor classifier; noise reduction; text categorization; text classification; training data; Artificial intelligence; Electronic mail; Machine learning; Natural languages; Noise reduction; Organizing; Runtime; Text categorization; Training data; Web sites;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2003 International Conference on
  • Print_ISBN
    0-7803-8131-9
  • Type
    conf
  • DOI
    10.1109/ICMLC.2003.1260115
  • Filename
    1260115