DocumentCode
1927006
Title
Integrating Incremental Feature Weighting into Naïve Bayes Text Classifier
Author
Kim, Han Joon ; Chang, Jaeyoung
Author_Institution
Seoul Univ., Seoul
Volume
2
fYear
2007
fDate
19-22 Aug. 2007
Firstpage
1137
Lastpage
1143
Abstract
In real-world operational environments, text classification systems must cope with an incomplete training set and no prior knowledge of the feature space. In this regard, the most appropriate algorithm for operational text classification is naive Bayes, since its pre-learned classification model and feature space are easy to update incrementally. Our work focuses on improving the naive Bayes classifier through a feature weighting strategy. The basic idea is that parameter estimation in naive Bayes can take into account the degree of feature importance as well as the feature distribution. In addition, we extend a conventional algorithm for incremental feature update in order to develop a dynamic feature space in an operational environment. Through experiments on the Reuters-21578 and 20 Newsgroups benchmark collections, we show that the traditional multinomial naive Bayes classifier can be significantly improved by χ²-statistic-based feature weighting.
Keywords
Bayes methods; classification; feature extraction; learning (artificial intelligence); text analysis; dynamic feature space; incomplete training set; incremental feature weighting; naive Bayes text classification systems; operational environment; parameter estimation; pre-learned classification model; Cybernetics; Electronic mail; IP networks; Knowledge engineering; Machine learning; Parameter estimation; Software libraries; Statistics; Text categorization; Web pages; Feature selection; Feature weighting; Naïve Bayes classifier; Text classification; χ²-statistic
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location
Hong Kong
Print_ISBN
978-1-4244-0973-0
Electronic_ISBN
978-1-4244-0973-0
Type
conf
DOI
10.1109/ICMLC.2007.4370315
Filename
4370315