Title :
Intelligent Information Extraction with Soft Matching Rules and Knowledge Discovery Using Genetic Algorithm for Text Mining
Author :
Christy, A. ; Thambidurai, P.
Author_Institution :
Sathyabama Univ., Chennai
Abstract :
The popularity of the Internet and the large number of documents now available in electronic form have motivated the search for hidden knowledge in text collections. In this paper, we present information extraction using NLP techniques combined with soft matching rules; the extracted information is then stored in the form of scenario template production. Once the information is extracted, we use a genetic algorithm to classify it by relevancy, and we compare the relevancy achieved by the genetic algorithm with that of traditional classification techniques. The system was tested on data collected from NSF research abstracts and on abstracts from two different domains of www.computer.org, and its recall improved after the soft matching rules were applied. The genetic algorithm is an effective classifier and remains quite competitive with the C4.8 method even as the concept increases in complexity.
Keywords :
data mining; feature extraction; genetic algorithms; natural language processing; text analysis; C4.8 method; data collection; genetic algorithm; information extraction; intelligent information extraction; knowledge discovery; soft matching rules; text mining; Abstracts; Competitive intelligence; Computational intelligence; Data mining; Genetic algorithms; Hidden Markov models; Natural language processing; Production; Spatial databases; Text mining;
Conference_Title :
International Conference on Computational Intelligence and Multimedia Applications, 2007
Conference_Location :
Sivakasi, Tamil Nadu
Print_ISBN :
0-7695-3050-8
DOI :
10.1109/ICCIMA.2007.350