DocumentCode :
3032794
Title :
The Effect of Combining Different Feature Selection Methods on Arabic Text Classification
Author :
Al-Thubaity, Abdulmohsen ; Abanumay, Norah ; Al-Jerayyed, Sara ; Alrukban, Aljoharah ; Mannaa, Zarah
Author_Institution :
Comput. Res. Inst., King Abdulaziz City for Sci. & Technol., Riyadh, Saudi Arabia
fYear :
2013
fDate :
1-3 July 2013
Firstpage :
211
Lastpage :
216
Abstract :
Feature selection is one of several factors affecting text classification systems. It aims to choose a representative subset of all features to reduce the complexity of classification problems. Usually, a single method is used for feature selection. For English, several attempts to combine different feature selection methods have been reported. To the best of our knowledge, no such attempts have been reported for Arabic text classification. In this study, we examined the effect of combining five feature selection methods, namely CHI, IG, GSS, NGL and RS, on Arabic text classification accuracy. Two combination approaches were used: intersection (AND) and union (OR). The NB classification algorithm was used to classify a Saudi Press Agency dataset comprising 6,300 texts divided evenly into six classes. Three feature representation schemes were used, namely Boolean, TFiDF and LTC. The experiments show a slight improvement in classification accuracy when two or three feature selection methods are combined. No improvement in classification accuracy was seen when four or all five feature selection methods were combined.
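The intersection (AND) and union (OR) combinations named in the abstract can be illustrated with a minimal sketch. It assumes that each feature selection method yields a score per feature and that the top-k features of each method are combined as sets; the abstract does not state how the paper selects or thresholds features, so the helper names `top_k` and `combine`, the value of k, and the scores below are all hypothetical.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring features.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda feature: (-scores[feature], feature))
    return set(ranked[:k])


def combine(selections, mode):
    """Combine feature sets by intersection ("AND") or union ("OR")."""
    if not selections:
        return set()
    if mode == "AND":
        return set.intersection(*selections)
    if mode == "OR":
        return set.union(*selections)
    raise ValueError("mode must be 'AND' or 'OR'")


# Made-up scores for illustration only; these are not values from the paper.
chi_scores = {"market": 0.9, "goal": 0.7, "rain": 0.2, "bank": 0.1}
ig_scores = {"market": 0.8, "goal": 0.1, "rain": 0.6, "bank": 0.3}

chi_top = top_k(chi_scores, 2)   # features kept by the first method
ig_top = top_k(ig_scores, 2)     # features kept by the second method

and_features = combine([chi_top, ig_top], "AND")  # kept by both methods
or_features = combine([chi_top, ig_top], "OR")    # kept by at least one method
```

Under these assumptions, AND can only shrink the feature set and OR can only grow it, which is one way to see why combining many methods might stop helping: the intersection may become too small and the union too large.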
Keywords :
classification; natural language processing; text analysis; Arabic text classification; Boolean; CHI; GSS; IG; LTC; NGL; RS; Saudi Press Agency dataset; TFiDF; feature selection methods; intersection combination; representative subset; union combination; Accuracy; Classification algorithms; Computers; Diversity reception; Educational institutions; Niobium; Text categorization; Arabic text classification; classification accuracy; classification algorithms; feature representation; feature selection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), 2013 14th ACIS International Conference on
Conference_Location :
Honolulu, HI
Type :
conf
DOI :
10.1109/SNPD.2013.89
Filename :
6598468