• DocumentCode
    1927006
  • Title

    Integrating Incremental Feature Weighting into Naïve Bayes Text Classifier

  • Author

    Kim, Han Joon ; Chang, Jaeyoung

  • Author_Institution
    Seoul Univ., Seoul
  • Volume
    2
  • fYear
    2007
  • fDate
    19-22 Aug. 2007
  • Firstpage
    1137
  • Lastpage
    1143
  • Abstract
    In real-world operational environments, text classification systems must cope with an incomplete training set and no prior knowledge of the feature space. In this regard, the most appropriate algorithm for operational text classification is naive Bayes, since its pre-learned classification model and feature space are easy to update incrementally. Our work focuses on improving the naive Bayes classifier through a feature weighting strategy. The basic idea is that parameter estimation in naive Bayes can take into account the degree of feature importance as well as the feature distribution. In addition, we extend a conventional algorithm for incremental feature update to develop a dynamic feature space in an operational environment. Through experiments on the Reuters-21578 and 20 Newsgroups benchmark collections, we show that the traditional multinomial naive Bayes classifier can be significantly improved by χ²-statistic based feature weighting.
  • Keywords
    Bayes methods; classification; feature extraction; learning (artificial intelligence); text analysis; dynamic feature space; incomplete training set; incremental feature weighting; naive Bayes text classification systems; operational environment; parameter estimation; pre-learned classification model; Cybernetics; Electronic mail; IP networks; Knowledge engineering; Machine learning; Parameter estimation; Software libraries; Statistics; Text categorization; Web pages; Feature selection; Feature weighting; Naïve Bayes classifier; Text classification; χ²-statistic;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2007 International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    978-1-4244-0973-0
  • Electronic_ISBN
    978-1-4244-0973-0
  • Type

    conf

  • DOI
    10.1109/ICMLC.2007.4370315
  • Filename
    4370315
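
The abstract describes two ideas: a naive Bayes model whose counts and feature space can be updated incrementally, and parameter estimates that reflect feature importance through the χ² statistic. The sketch below is one possible reading of those ideas, not the authors' algorithm. The class name `WeightedNaiveBayes`, the weight `1 + χ²`, and the Laplace smoothing are all assumptions made for illustration; the record does not give the paper's actual weighting formula.

```python
import math
from collections import defaultdict


class WeightedNaiveBayes:
    """Multinomial naive Bayes with incrementally updated counts (illustrative only)."""

    def __init__(self):
        self.n_docs = 0
        self.class_docs = defaultdict(int)                        # class -> number of documents
        self.term_count = defaultdict(lambda: defaultdict(int))   # class -> term -> occurrences
        self.doc_freq = defaultdict(lambda: defaultdict(int))     # class -> term -> documents with term
        self.vocab = set()

    def update(self, tokens, label):
        """Absorb one labeled document; unseen terms extend the feature space."""
        self.n_docs += 1
        self.class_docs[label] += 1
        for t in tokens:
            self.term_count[label][t] += 1
        for t in set(tokens):
            self.doc_freq[label][t] += 1
        self.vocab.update(tokens)

    def chi2(self, term, label):
        """Chi-square statistic of the 2x2 term/class document contingency table."""
        a = self.doc_freq[label].get(term, 0)            # in class, contains term
        b = sum(self.doc_freq[other].get(term, 0)        # outside class, contains term
                for other in self.class_docs if other != label)
        c = self.class_docs[label] - a                   # in class, lacks term
        d = self.n_docs - a - b - c                      # outside class, lacks term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        return 0.0 if denom == 0 else self.n_docs * (a * d - b * c) ** 2 / denom

    def log_score(self, tokens, label):
        """Log posterior (up to a constant) with chi-square weighted term counts."""
        # Assumed weighting: each count is scaled by 1 + chi2, then Laplace-smoothed.
        weights = {t: 1.0 + self.chi2(t, label) for t in self.vocab}
        total = sum(weights[t] * self.term_count[label].get(t, 0) for t in self.vocab)
        denom = total + len(self.vocab)
        score = math.log(self.class_docs[label] / self.n_docs)
        for t in tokens:
            if t in self.vocab:  # terms never seen in training are ignored
                num = weights[t] * self.term_count[label].get(t, 0) + 1.0
                score += math.log(num / denom)
        return score

    def predict(self, tokens):
        """Return the class with the highest weighted log score."""
        return max(self.class_docs, key=lambda c: self.log_score(tokens, c))
```

Because the model stores only counts, χ² weights are recomputed from the current counts at prediction time, so a document added later (including one that introduces a new term) changes both the vocabulary and the weights without retraining from scratch.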