• DocumentCode
    2223455
  • Title

    MSVM-kNN: Combining SVM and k-NN for Multi-class Text Classification

  • Author

    Yuan, Pingpeng ; Chen, Yuqin ; Jin, Hai ; Huang, Li

  • Author_Institution
    Service Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan
  • fYear
    2008
  • fDate
    14-15 July 2008
  • Firstpage
    133
  • Lastpage
    140
  • Abstract
    Text categorization is the process of assigning documents to a set of predefined categories. It is widely used in data-oriented management applications. Many popular algorithms for text categorization have been proposed, such as Naive Bayes, k-Nearest Neighbor (k-NN), and Support Vector Machine (SVM). However, these approaches do not perform well in every case. For example, SVM cannot correctly identify the category of a document when the text lies in a zone where multiple categories cross, and k-NN cannot effectively handle overlapping category borders. In this paper, we propose an approach named Multi-class SVM-kNN (MSVM-kNN), which combines SVM and k-NN. In this approach, SVM is first used to identify category borders, and then k-NN classifies the documents that lie among those borders. MSVM-kNN can overcome the shortcomings of SVM and k-NN and improve the performance of multi-class text classification. The experimental results show that MSVM-kNN performs better than SVM or k-NN alone.
  • Keywords
    classification; support vector machines; text analysis; data-oriented management application; document assignment; k-nearest neighbor; multiclass text classification; support vector machine; text categorization; Application software; Classification tree analysis; Computer science; Conferences; Decision trees; Grid computing; Neural networks; Support vector machine classification; Support vector machines; Text categorization; SVM; Text Categorization; kNN;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing and Systems, 2008. WSCS '08. IEEE International Workshop on
  • Conference_Location
    Huangshan
  • Print_ISBN
    978-0-7695-3316-2
  • Electronic_ISBN
    978-0-7695-3316-2
  • Type

    conf

  • DOI
    10.1109/WSCS.2008.36
  • Filename
    4570829
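
The two-stage idea in the abstract (SVM first, k-NN for documents near category borders) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how border documents are detected, so the `gap` threshold on the two best decision scores is an assumption, and the hand-set linear weights in `MODELS` stand in for trained one-vs-rest SVMs. The names `linear_scores`, `knn_predict`, `svm_then_knn`, `MODELS`, and `TRAIN`, and the categories "A", "B", "C", are all hypothetical.

```python
import math
from collections import Counter


def linear_scores(models, x):
    """Per-class decision values w.x + b from one-vs-rest linear models."""
    return {c: sum(wj * xj for wj, xj in zip(w, x)) + b
            for c, (w, b) in models.items()}


def knn_predict(train, x, k=3):
    """Majority label among the k training points closest to x (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def svm_then_knn(models, train, x, gap=1.0, k=3):
    """Return (label, stage).

    Assumption: a document counts as a border case when its two best
    decision scores differ by less than `gap`; such cases go to k-NN.
    """
    ranked = sorted(linear_scores(models, x).items(),
                    key=lambda kv: kv[1], reverse=True)
    (best, s1), (_, s2) = ranked[0], ranked[1]
    if s1 - s2 >= gap:
        return best, "svm"
    return knn_predict(train, x, k), "knn"


# Hypothetical hand-set linear models (weights, bias) replacing trained SVMs.
MODELS = {
    "A": ((1.0, 0.0), 0.0),
    "B": ((0.0, 1.0), 0.0),
    "C": ((-1.0, -1.0), 1.0),
}

# Hypothetical 2-D feature vectors with their category labels.
TRAIN = [
    ((3.0, 0.0), "A"), ((1.5, 0.8), "A"), ((1.2, 1.0), "A"),
    ((0.0, 3.0), "B"), ((0.5, 2.5), "B"), ((1.0, 1.3), "B"),
    ((0.0, 0.0), "C"), ((0.2, 0.1), "C"), ((0.1, 0.3), "C"),
]

if __name__ == "__main__":
    print(svm_then_knn(MODELS, TRAIN, (3.0, 0.2)))  # clear-cut document
    print(svm_then_knn(MODELS, TRAIN, (1.1, 1.0)))  # document near the A/B border
```

In this sketch the first query is far from every border, so the linear scores decide it directly, while the second has nearly equal scores for "A" and "B" and is passed to k-NN. A real system would replace `MODELS` with SVMs trained on document feature vectors and would tune `gap` and `k` on held-out data.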