Title :
An efficient feature selection method using named entity recognition for Chinese text categorization
Author :
Liu, Bin ; Li, Chunping
Author_Institution :
Sch. of Software, Tsinghua Univ., Beijing, China
Abstract :
Feature selection is an important task for text categorization. Traditional feature selection methods are based on terms, but they may lose useful information in the text. In this paper, we present a feature selection method that considers not only general terms but also named entities. To accompany this feature selection method, we propose a term weighting scheme for named entities. Experiments show that our method is effective compared with traditional methods.
Keywords :
text analysis; Chinese text categorization; feature selection; named entities; named entity recognition; term weighting scheme; Cybernetics; Machine learning; Text categorization; Text recognition; term weighting
Conference_Title :
Machine Learning and Cybernetics, 2009 International Conference on
Conference_Location :
Baoding
Print_ISBN :
978-1-4244-3702-3
Electronic_ISBN :
978-1-4244-3703-0
DOI :
10.1109/ICMLC.2009.5212749