DocumentCode :
2728745
Title :
Text Feature Ranking Based on Rough-set Theory
Author :
Tan, Songbo ; Wang, Yuefen ; Cheng, Xueqi
Author_Institution :
Chinese Acad. of Sci., Beijing
fYear :
2007
fDate :
2-5 Nov. 2007
Firstpage :
659
Lastpage :
662
Abstract :
Aiming to reduce dimensionality without sacrificing classification performance, the authors draw on attribute reduction based on the discernibility matrix in rough-set theory and propose two text feature selection algorithms, DB1 and DB2. The experimental results indicate that DB2 not only yields much higher accuracy than information gain when the number of features is smaller than 6000, but also requires much less CPU time than information gain.
Keywords :
rough set theory; text analysis; attribute reduction; discernibility matrix; information gain; rough-set theory; text feature ranking; text feature selection algorithm; Classification algorithms; Computers; Feature extraction; Frequency; Geology; Iron; Performance gain; Symmetric matrices; Text categorization; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Intelligence, IEEE/WIC/ACM International Conference on
Conference_Location :
Fremont, CA
Print_ISBN :
978-0-7695-3026-0
Type :
conf
DOI :
10.1109/WI.2007.31
Filename :
4427168