• DocumentCode
    1583205
  • Title

    Building a Simple and Effective Text Categorization System using Relative Importance in Category

  • Author

    Yan, Bingheng ; Qian, Depei

  • Author_Institution
    Xi'an Jiaotong Univ., Xi'an
  • Volume
    1
  • fYear
    2007
  • Firstpage
    108
  • Lastpage
    114
  • Abstract
    With the rapid development of the World Wide Web, text categorization has become the key technology for organizing and processing large volumes of document data. There are a variety of text categorization methods, such as k-nearest neighbor (kNN) and support vector machine (SVM). However, these methods are either too complicated or not effective enough. In this paper, we present a new method called relative importance in category (RIIC), which is simpler than most methods and has a lower time complexity. To verify the performance of RIIC, we build a text categorization system (TCS) based on RIIC and compare it with TCSs based on kNN and SVM. Experimental results show that in most cases RIIC performs better than kNN and SVM.
  • Keywords
    classification; text analysis; document data; relative importance in category; text categorization system; time complexity; Costs; Data engineering; Filters; Machine learning; Nearest neighbor searches; Organizing; Support vector machine classification; Support vector machines; Text categorization; Web sites;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Computation, 2007. ICNC 2007. Third International Conference on
  • Conference_Location
    Haikou
  • Print_ISBN
    978-0-7695-2875-5
  • Type

    conf

  • DOI
    10.1109/ICNC.2007.289
  • Filename
    4344164