• DocumentCode
    1571934
  • Title
    An Extensive Empirical Study of Feature Selection for Text Categorization
  • Author
    Qiu, Li-Qing ; Zhao, Ru-Yi ; Zhou, Gang ; Yi, Sheng-Wei
  • Author_Institution
    State Key Lab. of Software Dev. Environ., Beihang Univ., Beihang
  • fYear
    2008
  • Firstpage
    312
  • Lastpage
    315
  • Abstract
    We present a novel feature selection (FS) approach for text categorization. It first constructs a local feature set for each category by selecting features based on three different schemes: DF, TF, and TFIDF. It then constructs a global feature set by applying the well-known CHI method to the local feature sets. We compare our method experimentally with the CHI method and summarize the results, which show that the proposed method based on the DF scheme performs comparably to the CHI method.
  • Keywords
    learning (artificial intelligence); pattern classification; text analysis; CHI methods; feature selection; local feature set; text categorization; Character generation; Computational efficiency; Frequency measurement; Gain measurement; Information analysis; Information science; Performance gain; Programming; Space technology
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Computer and Information Science, 2008. ICIS 08. Seventh IEEE/ACIS International Conference on
  • Conference_Location
    Portland, OR
  • Print_ISBN
    978-0-7695-3131-1
  • Type
    conf
  • DOI
    10.1109/ICIS.2008.49
  • Filename
    4529838
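
The two-stage selection summarized in the abstract (a local feature set per category, then a global set ranked by CHI) might be sketched roughly as follows. This is an illustrative reading only, not the authors' implementation: it covers the DF scheme alone, and the function names, the `local_k` and `global_k` cutoffs, the alphabetical tie-breaking, and the choice to score each term by its maximum chi-square value over categories are all assumptions that the record does not specify.

```python
from collections import Counter
from typing import Dict, List, Sequence, Set


def local_feature_sets(docs: Sequence[Sequence[str]], labels: Sequence[str],
                       local_k: int) -> Dict[str, Set[str]]:
    """Stage 1 (DF scheme): per category, keep the local_k terms that occur
    in the most documents of that category. Ties break alphabetically."""
    per_cat_df: Dict[str, Counter] = {}
    for tokens, label in zip(docs, labels):
        # set(tokens) counts each term at most once per document (DF, not TF)
        per_cat_df.setdefault(label, Counter()).update(set(tokens))
    local: Dict[str, Set[str]] = {}
    for cat, counts in per_cat_df.items():
        ranked = sorted(counts, key=lambda t: (-counts[t], t))
        local[cat] = set(ranked[:local_k])
    return local


def chi_square(term: str, cat: str, docs: Sequence[Sequence[str]],
               labels: Sequence[str]) -> float:
    """Standard chi-square statistic from the 2x2 term/category table."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_cat = label == cat
        if has_term and in_cat:
            a += 1          # term present, document in category
        elif has_term:
            b += 1          # term present, document in another category
        elif in_cat:
            c += 1          # term absent, document in category
        else:
            d += 1          # term absent, document in another category
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def select_features(docs: Sequence[Sequence[str]], labels: Sequence[str],
                    local_k: int, global_k: int) -> List[str]:
    """Stage 2: rank the union of the local sets by CHI, keep the top global_k.
    Taking the max score over categories is an assumption of this sketch."""
    local = local_feature_sets(docs, labels, local_k)
    candidates = set().union(*local.values())
    cats = sorted(local)
    score = {t: max(chi_square(t, c, docs, labels) for c in cats)
             for t in candidates}
    return sorted(candidates, key=lambda t: (-score[t], t))[:global_k]
```

Restricting the CHI ranking to the union of the local sets means only terms that are already frequent within some category are ever scored, which is one plausible reading of "based on the local feature set".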