• DocumentCode
    2550427
  • Title
    Text classification using multi-word features
  • Author
    Zhang, Wen ; Yoshida, Taketoshi ; Tang, Xijin
  • Author_Institution
    Japan Adv. Inst. of Sci. & Technol., Ishikawa
  • fYear
    2007
  • fDate
    7-10 Oct. 2007
  • Firstpage
    3519
  • Lastpage
    3524
  • Abstract
    We carried out a series of experiments on text classification using multi-word features. An automated method was proposed to extract multi-words from a text data set, and two strategies were developed to normalize the multi-words into two different versions of multi-word features. After the texts were represented using each of these two versions, text classification was carried out on both representations to compare the effectiveness of the two strategies. The linear and nonlinear polynomial kernels of the support vector machine (SVM) were also compared on the text classification task.
  • Keywords
    feature extraction; pattern classification; text analysis; multi word feature extraction; nonlinear polynomial kernel; support vector machine; text classification; text dataset; Data mining; Data preprocessing; Feature extraction; Kernel; Logic; Ontologies; Personnel; Support vector machine classification; Support vector machines; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    978-1-4244-0990-7
  • Electronic_ISBN
    978-1-4244-0991-4
  • Type
    conf
  • DOI
    10.1109/ICSMC.2007.4414208
  • Filename
    4414208