• DocumentCode
    2168484
  • Title
    Using association features to enhance the performance of Naive Bayes text classifier
  • Author
    Yang, Zhang; Lijun, Zhang; Jianfeng, Yan; Zhanhuai, Li
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Northwestern Polytech. Univ., China
  • fYear
    2003
  • fDate
    27-30 Sept. 2003
  • Firstpage
    336
  • Lastpage
    341
  • Abstract
    Word co-occurrence can contribute to automatic text classification. However, this information cannot be represented in the feature set when only primitive features are used, and it is only partially represented when n-grams are used as features. In this paper, we define a novel feature, the association feature, to describe this information. To ensure that the selected association features are good discriminators, we propose an approach for creating the association feature set that includes a redundancy pruning algorithm and a feature selection algorithm. The experimental results show that the performance of the Naive Bayes text classifier can be improved by using association features, which also means that the selected set of association features contributes more to text classification than primitive features and n-grams.
  • Keywords
    Bayes methods; character recognition; data mining; text analysis; Naive Bayes text classifier; association feature; automatic text classification; n-grams; redundancy pruning; Classification tree analysis; Computer science; Data mining; Decision trees; Feature extraction; Information analysis; Itemsets; Space technology; Text categorization; Text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Multimedia Applications, 2003. ICCIMA 2003. Proceedings. Fifth International Conference on
  • Print_ISBN
    0-7695-1957-1
  • Type
    conf
  • DOI
    10.1109/ICCIMA.2003.1238148
  • Filename
    1238148