Title :
Applying Data Mining to Pseudo-Relevance Feedback for High Performance Text Retrieval
Author :
Huang, Xiangji ; Huang, Yan Rui ; Wen, Miao ; An, Aijun ; Liu, Yang ; Poon, Josiah
Author_Institution :
Sch. of Inf. Technol., York Univ., Toronto, ON
Abstract :
In this paper, we investigate the use of data mining, in particular text classification and co-training techniques, to identify more relevant passages based on a small set of labeled passages obtained from the blind feedback of a retrieval system. The data mining results are used to expand query terms and to re-estimate some of the parameters used in a probabilistic weighting function. We evaluate the data-mining-based feedback method on the TREC HARD data set. The results show that data mining can be successfully applied to improve text retrieval performance. We report our experimental findings in detail.
Keywords :
data mining; pattern classification; relevance feedback; text analysis; blind feedback; high performance text retrieval system; labeled passages; probabilistic weighting function; pseudo-relevance feedback; text classification; Computer science; Feedback; Information retrieval; Information technology; Labeling; Supervised learning; Testing; Text categorization; Training data
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2701-7
DOI :
10.1109/ICDM.2006.22