DocumentCode :
604550
Title :
Text feature extraction based on joint conditional entropy
Author :
Yanmin Chen ; Xinwei Wang
Author_Institution :
Dept. of Comput. Sci. & Technol., East China Normal Univ., Shanghai, China
fYear :
2012
fDate :
29-31 Dec. 2012
Firstpage :
2055
Lastpage :
2058
Abstract :
Extracting features from data is an important task in data mining and summarization. Text feature extraction aims to extract useful information from texts by identifying and exploring patterns of interest. We propose a feature extraction strategy based on joint conditional entropy and a genetic algorithm. Joint conditional entropy measures the uncertainty of a set of variables given conditions; it is used to obtain the feature words that represent texts. Genetic algorithms have been applied successfully in many fields and are useful for solving optimization and search problems. In this paper, we first preprocess texts to obtain the words, then present the joint conditional entropy, which is used to define the fitness function of a genetic algorithm that discovers words suitable for representing texts. Finally, experimental results show that this approach is suitable for extracting ideal features of text.
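The abstract describes joint conditional entropy only as the uncertainty of a set of variables given conditions. As a minimal sketch of that quantity, assuming the standard definition H(X1..Xn | Y) = -sum p(x, y) log2 p(x | y) estimated from counts, the following Python function may help make the idea concrete. The function name, the sample layout, and the use of base-2 logarithms are illustrative assumptions, not the authors' formulation, and the genetic algorithm itself is not shown.

```python
from collections import Counter
from math import log2


def joint_conditional_entropy(samples):
    """Estimate H(X1..Xn | Y) in bits from observed (xs, y) pairs.

    Hypothetical helper: `xs` is a tuple of variable values (for example,
    presence flags of candidate feature words) and `y` is the condition
    (for example, a document class). Probabilities are plain frequencies.
    """
    samples = [(tuple(xs), y) for xs, y in samples]
    n = len(samples)
    joint = Counter(samples)                # counts of (xs, y)
    cond = Counter(y for _, y in samples)   # counts of y alone
    h = 0.0
    for (xs, y), count in joint.items():
        # p(xs, y) * log2 p(xs | y), with p(xs | y) = count / cond[y]
        h -= (count / n) * log2(count / cond[y])
    return h


# Word-presence patterns fully determined by the class: no uncertainty left.
determined = [((1, 0), "a"), ((1, 0), "a"), ((0, 1), "b")]
# Four equally likely patterns under a single class: two bits of uncertainty.
uniform = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a")]
```

Under these assumptions, a value like this could serve as one term of a fitness function that scores a candidate set of feature words; how the paper actually combines it with the genetic algorithm is not stated in the abstract.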
Keywords :
data mining; entropy; feature extraction; genetic algorithms; text analysis; fitness function; genetic algorithm; joint conditional entropy; text feature extraction; useful information; text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science and Network Technology (ICCSNT), 2012 2nd International Conference on
Conference_Location :
Changchun
Print_ISBN :
978-1-4673-2963-7
Type :
conf
DOI :
10.1109/ICCSNT.2012.6526323
Filename :
6526323