Title of article :
Some Formal Analysis of Rocchios Similarity-Based Relevance Feedback Algorithm
Author/Authors :
Chen، Zhixiang نويسنده , , Zhu، Binhai نويسنده ,
Issue Information :
روزنامه با شماره پیاپی سال 2002
Pages :
-60
From page :
61
To page :
0
Abstract :
Rocchioʹs similarity-based Relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive supervised learning algorithm from examples. In spite of its popularity in various applications there is little rigorous analysis of its learning complexity in literature. In this paper we show that in the binary vector space model, if the initial query vector is 0, then for any of the four typical similarities (inner product, dice coefficient, cosine coefficient, and Jaccard coefficient), Rocchioʹs similarity-based relevance feedback algorithm makes at least n mistakes when used to search for a collection of documents represented by a monotone disjunction of at most k relevant features (or terms) over the ndimensional binary vector space {0, 1}n. When an arbitrary initial query vector in {0, 1}n is used, it makes at least (n + k 3)/2 mistakes to search for the same collection of documents. The linear lower bounds are independent of the choices of the threshold and coefficients that the algorithm may use in updating its query vector and making its classification.
Keywords :
Supervised learning , lower bound , relevance feedback , vector space , Similarity
Journal title :
INFORMATION RETRIEVAL
Serial Year :
2002
Journal title :
INFORMATION RETRIEVAL
Record number :
89799
Link To Document :
بازگشت