DocumentCode :
2465144
Title :
Automatic construction of an evaluation dataset from wisdom of the crowds for information retrieval applications
Author :
Wang, Chieh-Jen ; Huang, Hung-Sheng ; Chen, Hsin-Hsi
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ., Taipei, Taiwan
fYear :
2012
fDate :
14-17 Oct. 2012
Firstpage :
490
Lastpage :
495
Abstract :
A benchmark evaluation dataset that reflects users' real-world search behavior is indispensable for evaluating the performance of information retrieval applications. A typical evaluation dataset consists of a document set, a topic set, and relevance judgments. Preparing an evaluation dataset manually is costly in human effort, and human-made topics may not fully capture users' real search needs. This paper aims to construct an evaluation dataset automatically from the wisdom of the crowds in search query logs for information retrieval applications. We begin by collecting the clicked documents recorded in search query logs, then select suitable queries to serve as topics, sample documents from the document collection for each query, and estimate the multi-level relevance of the sampled documents using click count, normalized count, and average count functions. Three learning-to-rank algorithms (linear regression, SVMRank, and FRank) are trained and tested on the machine-made evaluation dataset. We compare their performance with that on MQ2007, a test collection of LETOR, which is a well-known human-made benchmark dataset for learning to rank. The experimental results show that the performance trends are similar for the machine-made and human-made evaluation datasets, which demonstrates that the proposed models can construct an evaluation dataset of quality similar to that of a human-made one.
Keywords :
document handling; information retrieval; learning (artificial intelligence); regression analysis; sampling methods; support vector machines; user interfaces; FRank algorithm; SVM algorithm; average count function; click count function; document collection; document sampling; document set; human-made evaluation dataset; information retrieval application; learning-to-rank algorithm; linear regression algorithm; machine-made evaluation dataset; normalized count function; relevance judgment; search query log; topic set; user search behavior; user search need; Humans; Information retrieval; Linear regression; Measurement; Predictive models; Testing; Training; evaluation dataset construction; retrieval evaluation; search query logs analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4673-1713-9
Electronic_ISBN :
978-1-4673-1712-2
Type :
conf
DOI :
10.1109/ICSMC.2012.6377772
Filename :
6377772