Title of article :
Using structural information to improve search in Web collections
Author/Authors :
Edleno S. de Moura, David Fernandes, Berthier Ribeiro-Neto, Altigran S. da Silva, Marcos André Gonçalves
Issue Information :
Monthly, serial issue, year 2010
Pages :
11
From page :
2503
To page :
2513
Abstract :
In this work, we investigate the problem of using the block structure of Web pages to improve ranking results. Starting with the basic intuitions behind term frequency (TF) and inverse document frequency (IDF), we propose nine block-weight functions that distinguish the impact of term occurrences inside page blocks rather than inside whole pages. These functions are then used to compute a modified BM25 ranking function. Using four distinct Web collections, we ran extensive experiments to compare our block-weight ranking formulas with two baselines: (a) a BM25 ranking applied to full pages, and (b) a BM25 ranking that takes into account best blocks. Our results suggest that the block-weighting ranking method is superior to both baselines across all the collections we used, with average gains in precision of 5 to 20%.
Journal title :
Journal of the American Society for Information Science and Technology
Serial Year :
2010
Record number :
994351