Title of article :
Feature selection for text classification with Naïve Bayes
Author/Authors :
Chen, Jingnian and Huang, Houkuan and Tian, Shengfeng and Qu, Youli
Issue Information :
Journal with serial issue number, year 2009
Pages :
4
From page :
5432
To page :
5435
Abstract :
As an important preprocessing technique in text classification, feature selection can improve the scalability, efficiency and accuracy of a text classifier. In general, a good feature selection method should take domain and algorithm characteristics into account. Because the Naïve Bayesian classifier is simple, efficient and highly sensitive to feature selection, research on feature selection designed specifically for it is significant. This paper presents two feature evaluation metrics for the Naïve Bayesian classifier applied to multi-class text datasets: Multi-class Odds Ratio (MOR) and Class Discriminating Measure (CDM). Text classification experiments with Naïve Bayesian classifiers were carried out on two multi-class text collections. The results indicate that CDM and MOR select features noticeably better than other feature selection approaches.
Keywords :
Naïve Bayes, text classification, feature selection, text preprocessing
Journal title :
Expert Systems with Applications
Serial Year :
2009
Record number :
2345994