Title :
A Web Search Analysis Considering the Intention behind Queries
Author :
Mendoza, Marcelo ; Baeza-Yates, Ricardo
Author_Institution :
Dept. of Comput. Sci., Univ. de Valparaiso, Valparaiso
Abstract :
Identifying the user intention behind queries can help improve the precision of the document lists returned by Web search engines. For this reason, recent work has focused on building query classifiers that follow the categories proposed in the scientific literature. These works rely on query representations built from two main sources of information: text and click-through data. Despite this, we have little understanding of the nature and behaviour of the variables used to characterize queries. In this work we analyse the behaviour of these variables in order to understand them better and to identify the characteristics that allow query classifiers to improve their precision. The analysis shows that text-based variables discriminate between the categories better than those based on click-through data. Among these variables, the query length (the number of terms in a query), the Levenshtein distance between snippets and queries, and the PageRank metric are recommended features for query type classifiers.
Keywords :
Internet; query formulation; search engines; Levenshtein distance; PageRank metric; Web search analysis; Web search engines; query classifiers; Computer science; Information resources; Information retrieval; Mutual information; Navigation; Performance analysis; Scattering; Search engines; Taxonomy; Web search; Web usage mining; query taxonomies;
Conference_Title :
Web Conference, 2008. LA-WEB '08., Latin American
Conference_Location :
Espírito Santo
Print_ISBN :
978-0-7695-3397-1
Electronic_ISBN :
978-0-7695-3397-1
DOI :
10.1109/LA-WEB.2008.9