Title :
Application of Modified Genetic Algorithm in Feature Extraction of the Unstructured Data
Author :
Du, Nan ; Peng, Hong ; Zhang, Wenfeng
Author_Institution :
Sch. of Comput. Software Eng., South China Univ. of Technol., Guangzhou
Abstract :
Because unstructured data is huge in quantity and not uniform in form, existing mining algorithms have difficulty mining it. This paper proposes a modified genetic algorithm that extracts features from unstructured data with high efficiency and makes further mining more convenient. First, it discusses the characteristics of unstructured data and then introduces its preprocessing, such as word segmentation, establishing the stop-word table, and feature extraction. Next, the operations of the modified genetic algorithm, namely selection, crossover, and mutation, are presented. Finally, test results for the modified genetic algorithm are shown, and they show that the algorithm is effective.
Keywords :
data mining; document handling; feature extraction; genetic algorithms; extent mining algorithms; feature extraction; modified genetic algorithm; stop words table; unstructured data; word segmentation; Application software; Computer applications; Data mining; Dictionaries; Feature extraction; File servers; Frequency conversion; Genetic algorithms; Testing; Vocabulary; Feature extraction; Genetic Algorithm; Text Mining; Unstructured Data;
Conference_Title :
Advanced Computer Control, 2009. ICACC '09. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-3330-8
DOI :
10.1109/ICACC.2009.65