DocumentCode
2068590
Title
A new feature selection method based on distributional information for Text Classification
Author
Shi, Nianyun ; Liu, Lingling
Author_Institution
Coll. of Comput. & Commun. Eng., China Univ. of Pet. (East China), Dongying, China
Volume
1
fYear
2010
fDate
10-12 Dec. 2010
Firstpage
190
Lastpage
194
Abstract
Feature Selection (FS) is one of the most important issues in Text Classification (TC). A good feature selection can improve the efficiency and accuracy of a text classifier. Based on the analysis of the feature's distributional information, this paper presents a feature selection method named DIFS. In DIFS a new estimation mechanism is proposed to measure the relevance between feature's distribution characteristics and contribution to categorization. In addition, two kinds of algorithms are designed to implement DIFS. Experiments are carried out on a Chinese corpus and by comparison the proposed approach shows a better performance.
Keywords
classification; estimation theory; natural language processing; text analysis; Chinese corpus; DIFS; distributional information; estimation mechanism; feature selection method; text classification; text classifier; Estimation; Text categorization; Distributional Information; Feature Selection (FS); Text Classification (TC)
fLanguage
English
Publisher
IEEE
Conference_Titel
Progress in Informatics and Computing (PIC), 2010 IEEE International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-4244-6788-4
Type
conf
DOI
10.1109/PIC.2010.5687404
Filename
5687404