DocumentCode :
506911
Title :
A Feature Selection Algorithm Based on Poisson Estimates
Author :
Gao, Yingfan ; Wang, Hui-lin
Author_Institution :
Inst. of Sci. & Tech. Inf. of China, Beijing, China
Volume :
1
fYear :
2009
fDate :
14-16 Aug. 2009
Firstpage :
13
Lastpage :
18
Abstract :
Feature selection is one of the key technologies for text categorization. Current approaches fall mainly into two groups: statistics-based techniques, which derive primarily from information theory, and semantics-based techniques, which draw on natural language processing, the Semantic Web, and related fields. Based on the Poisson hypothesis, this article presents a new method that combines both approaches and tries to find features in documents that carry more semantic information. Comparison experiments on the Reuters-21578 corpus against the IG, Chi2 and WN algorithms show that this method has advantages over the other algorithms.
Keywords :
natural language processing; stochastic processes; text analysis; Poisson estimates; Poisson hypothesis; Reuters-21578 corpus; feature selection algorithm; information theory; natural language processing; semantic Web; text categorization; Frequency shift keying; Fuzzy systems; H infinity control; Information theory; Mutual information; Natural language processing; Random variables; Semantic Web; Statistics; Text categorization; Feature selection from categories; Poisson Hypothesis; Semantic Feature;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09. Sixth International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-0-7695-3735-1
Type :
conf
DOI :
10.1109/FSKD.2009.712
Filename :
5358669