DocumentCode :
259270
Title :
Mining Relevant Text Features for Retrieving Web Information
Author :
Pipanmaekaporn, Luepol ; Kamolsantiroj, Suwatchai
Author_Institution :
Dept. of Comput. & Inf. Sci., King Mongkut's Univ. of Technol. North Bangkok, Bangkok, Thailand
fYear :
2014
fDate :
Aug. 31 2014-Sept. 4 2014
Firstpage :
447
Lastpage :
452
Abstract :
It is a big challenge to develop effective methods that can discover high-quality, useful features in text documents. Most existing information retrieval and text mining methods focus on term-based approaches, which often suffer from the problems of term variation and noise. This paper presents an innovative approach that discovers relevant knowledge to precisely describe text features for retrieving web information. In particular, it extracts precise text patterns by considering both relevant and irrelevant documents. The discovered patterns are then used to find accurate relevant features in a training set. The proposed approach has been evaluated by implementing a novel information filtering model and comparing it against state-of-the-art models. Experimental results on the Reuters Corpus Volume 1 and TREC topics show that the proposed approach significantly outperforms the best baseline method.
Keywords :
Internet; information filtering; pattern recognition; text analysis; Reuters corpus volume 1; TREC topics; Web information retrieval; information filtering; term-based approach; text documents; text feature mining; text patterns; Data collection; Feature extraction; Noise measurement; Support vector machines; Text mining; Training; Feature Extraction; Feature Selection; Relevance Feedback and Text Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Applied Informatics (IIAI-AAI), 2014 IIAI 3rd International Conference on
Conference_Location :
Kitakyushu
Print_ISBN :
978-1-4799-4174-2
Type :
conf
DOI :
10.1109/IIAI-AAI.2014.96
Filename :
6913340