DocumentCode
3362701
Title
An Improved Ambiguity Measure Feature Selection for Text Categorization
Author
Liu, Zhiying ; Yang, Jieming
Author_Institution
Coll. of Inf. Eng., Northeast Dianli Univ., Jilin, China
Volume
1
fYear
2012
fDate
26-27 Aug. 2012
Firstpage
220
Lastpage
223
Abstract
The high dimensionality of text categorization poses major obstacles to applying many sophisticated learning algorithms. Feature selection, which reduces the number of features that represent documents, is therefore essential in text categorization. In this paper, we propose a feature selection method that improves the performance of Ambiguity Measure (AM) feature selection. We compare the proposed method with four feature selection methods: Information Gain (IG), Ambiguity Measure (AM), Odds Ratio (OR) and Mutual Information (MI), using two classification algorithms (Naïve Bayes and Support Vector Machines) on three datasets (20-Newsgroups, Reuters-21578 and WebKB). The experiments show that the proposed method performs significantly better than AM and MI, and achieves performance comparable to IG and OR.
Keywords
Bayes methods; learning (artificial intelligence); pattern classification; support vector machines; text analysis; AM; IG; MI; OR; ambiguity measure feature selection; classification algorithms; feature selection method; naïve Bayes; sophisticated learning algorithms; support vector machines; text categorization; Accuracy; Algorithm design and analysis; Classification algorithms; Mutual information; Support vector machines; Text categorization; Training; dimensionality reduction; feature selection; text categorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2012 4th International Conference on
Conference_Location
Nanchang, Jiangxi
Print_ISBN
978-1-4673-1902-7
Type
conf
DOI
10.1109/IHMSC.2012.62
Filename
6305666