• DocumentCode
    1571955
  • Title
    An Improved Condensing Algorithm
  • Author
    Hao, Xiulan ; Zhang, Chenghong ; Xu, Hexiang ; Tao, Xiaopeng ; Wang, Shuyun ; Hu, Yunfa
  • Author_Institution
    Dept. of CIT, Fudan Univ., Shanghai
  • fYear
    2008
  • Firstpage
    316
  • Lastpage
    321
  • Abstract
    The kNN classifier is widely used in text categorization. However, kNN has large computational and storage requirements, and its performance also suffers when the training data are unevenly distributed. A condensing technique is usually applied to reduce noise in the training data and to lower the time and space costs. The traditional condensing technique picks samples at random during initialization. Although random sampling is one way to reduce outliers, this heavy reliance on randomness can sometimes lead to poor performance, so that the advantages of sampling are suppressed. To avoid this, we propose a variation of the traditional condensing technique. Experimental results show that this strategy addresses the above problems effectively.
  • Keywords
    classification; learning (artificial intelligence); neural nets; sampling methods; text analysis; condensing algorithm; kNN classifier; outlier reduction; random sampling; text categorization; training data; Conference management; Costs; Distributed computing; Electronic mail; Information science; Management training; Noise reduction; Sampling methods; Text categorization; Training data; Condensing Algorithm; Selected Seeds; Text Categorization; kNN;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Science, 2008. ICIS 08. Seventh IEEE/ACIS International Conference on
  • Conference_Location
    Portland, OR
  • Print_ISBN
    978-0-7695-3131-1
  • Type
    conf
  • DOI
    10.1109/ICIS.2008.67
  • Filename
    4529839
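
The abstract refers to a traditional condensing technique that starts from randomly picked samples. As a rough illustration only, the sketch below follows the classical Hart-style condensed nearest neighbor procedure with a 1-NN rule and a random starting sample. It is not the variant proposed in the paper, whose details are not given in this record. The function names (`squared_distance`, `nearest_label`, `condense`) and the use of Euclidean distance are assumptions made for the example; text categorization systems often use cosine similarity instead.

```python
import random


def squared_distance(a, b):
    # Squared Euclidean distance between two equal-length vectors (assumed metric).
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(store, point):
    # 1-NN rule: label of the stored (vector, label) pair closest to `point`.
    return min(store, key=lambda item: squared_distance(item[0], point))[1]


def condense(samples, rng):
    """Hart-style condensing sketch with a randomly chosen initial sample.

    `samples` is a list of (vector, label) pairs; `rng` is a random.Random.
    Returns a subset that classifies every training sample correctly by 1-NN,
    assuming no two identical vectors carry different labels.
    """
    order = list(samples)
    rng.shuffle(order)          # random initialization, as in the traditional technique
    store = [order[0]]          # the condensed set starts with one random sample
    remaining = order[1:]
    changed = True
    while changed:              # repeat passes until no sample is moved
        changed = False
        still_out = []
        for vector, label in remaining:
            if nearest_label(store, vector) != label:
                store.append((vector, label))   # misclassified: keep it
                changed = True
            else:
                still_out.append((vector, label))
        remaining = still_out
    return store
```

Because the first stored sample is random, different seeds can yield different condensed sets, which is the sensitivity to initialization that the abstract describes.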