DocumentCode :
2761009
Title :
A new approach for text feature selection based on OWA operator
Author :
Ghaderi, Mohammad Ali ; Yazdani, Nasser ; Moshiri, Behzad ; Mahmoudi, Maryam Tayefeh
Author_Institution :
Central & Intell. Process. Center of Excellence, Univ. of Tehran, Tehran, Iran
fYear :
2010
fDate :
4-6 Dec. 2010
Firstpage :
579
Lastpage :
583
Abstract :
Feature selection plays a significant role in the precision of text classification algorithms. Various approaches exist, such as Information Gain, Chi-Square, Document Frequency, and Mutual Information. Combining some of the input features may improve classification effectiveness. In this paper, a new approach based on Ordered Weighted Averaging (OWA) is proposed for combining two feature selection algorithms, Information Gain (IG) and Chi-Square (χ²). The proposed approach is applied to the Reuters-21578 dataset. The results show that the OWA operator in general outperforms the averaging and maximizing operators, which have been used before in the text classification field. The micro-averaged F1 and macro-averaged F1 measures are used to evaluate the capability of OWA in comparison with the averaging and maximizing operators.
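As an illustration of the operator named in the abstract, the following is a minimal sketch of OWA applied to two per-term scores. It is not the paper's implementation: the function names (`owa`, `min_max_normalize`, `combine_and_rank`), the min-max normalization step, the weight vector `[0.7, 0.3]`, and the toy scores are all assumptions made for this example. The sketch relies only on the standard definition of OWA, in which the weights are applied to the scores after sorting them in descending order, so that weights `[0.5, 0.5]` reproduce the averaging operator and `[1.0, 0.0]` reproduce the maximizing operator.

```python
from typing import Dict, List, Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights[i] multiplies the i-th largest score."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1] (assumed step, so IG and chi-square are comparable)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine_and_rank(
    ig: Dict[str, float],
    chi2: Dict[str, float],
    weights: Sequence[float],
    k: int,
) -> List[str]:
    """Return the k terms with the highest OWA-combined score."""
    terms = list(ig)
    ig_n = dict(zip(terms, min_max_normalize([ig[t] for t in terms])))
    chi_n = dict(zip(terms, min_max_normalize([chi2[t] for t in terms])))
    combined = {t: owa([ig_n[t], chi_n[t]], weights) for t in terms}
    return sorted(terms, key=lambda t: combined[t], reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical scores for three terms; not taken from Reuters-21578.
    ig_scores = {"oil": 0.9, "the": 0.1, "bank": 0.5}
    chi2_scores = {"oil": 30.0, "the": 2.0, "bank": 40.0}
    print(combine_and_rank(ig_scores, chi2_scores, [0.7, 0.3], k=2))
```

Intermediate weight vectors such as `[0.7, 0.3]` interpolate between the maximizing and averaging operators, which is the sense in which OWA generalizes both.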
Keywords :
pattern classification; text analysis; Reuters-21578 data; chi-square selection algorithm; information gain selection algorithm; macroaveraged F1 measurement; microaveraged F1 measurement; ordered-weighted averaging operator; text classification algorithms; text feature selection; Artificial neural networks; Classification algorithms; Machine learning; Machine learning algorithms; Open wireless architecture; Support vector machine classification; Text categorization; Feature selection; OWA; combination of features; data fusion; text classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Telecommunications (IST), 2010 5th International Symposium on
Conference_Location :
Tehran
Print_ISBN :
978-1-4244-8183-5
Type :
conf
DOI :
10.1109/ISTEL.2010.5734091
Filename :
5734091