DocumentCode
259270
Title
Mining Relevant Text Features for Retrieving Web Information
Author
Pipanmaekaporn, Luepol ; Kamolsantiroj, Suwatchai
Author_Institution
Dept. of Comput. & Inf. Sci., King Mongkut's Univ. of Technol. North Bangkok, Bangkok, Thailand
fYear
2014
fDate
Aug. 31 2014-Sept. 4 2014
Firstpage
447
Lastpage
452
Abstract
It is a big challenge to develop effective methods that can discover high-quality, useful features in text documents. Most existing information retrieval and text mining methods focus on term-based approaches, which often suffer from the problems of term variation and noise. This paper presents an innovative approach that discovers relevant knowledge to precisely describe text features for retrieving web information. In particular, it extracts precise text patterns by considering both relevant and irrelevant documents. The discovered patterns are then used to find accurate relevant features in a training set. The proposed approach has been evaluated by implementing a novel information filtering model and comparing it against state-of-the-art models. Experimental results on the Reuters Corpus Volume 1 and TREC topics show that the proposed approach significantly outperforms the best baseline method.
Keywords
Internet; information filtering; pattern recognition; text analysis; Reuters corpus volume 1; TREC topics; Web information retrieval; term-based approach; text documents; text feature mining; text patterns; Data collection; Feature extraction; Noise measurement; Support vector machines; Text mining; Training; Feature Extraction; Feature Selection; Relevance Feedback and Text Mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Applied Informatics (IIAI-AAI), 2014 IIAI 3rd International Conference on
Conference_Location
Kitakyushu
Print_ISBN
978-1-4799-4174-2
Type
conf
DOI
10.1109/IIAI-AAI.2014.96
Filename
6913340