DocumentCode :
3268750
Title :
A probabilistic approach to metasearching with adaptive probing
Author :
Liu, Zhenyu ; Luo, Chang ; Cho, Junghoo ; Chu, Wesley W.
Author_Institution :
Dept. of Comput. Sci., California Univ., Los Angeles, CA, USA
fYear :
2004
fDate :
30 March-2 April 2004
Firstpage :
547
Lastpage :
558
Abstract :
An ever-increasing amount of valuable information is stored in Web databases, "hidden" behind search interfaces. To save the user's effort in manually exploring each database, metasearchers automatically select the databases most relevant to a user's query. In this paper, we focus on one of the technical challenges in metasearching, namely database selection. Past research uses a precollected summary of each database to estimate its "relevancy" to the query, and in many cases makes incorrect database selections. We propose two techniques: probabilistic relevancy modelling and adaptive probing. First, we model the relevancy of each database to a given query as a probability distribution, derived by sampling that database. Using the probabilistic model, the user can explicitly specify a desired level of certainty for database selection. The adaptive probing technique then decides which databases to contact, and how many, in order to satisfy the user's requirement. Our experiments on real hidden-Web databases indicate that our approach significantly improves the accuracy of database selection at the cost of a small number of database probes.
Keywords :
Internet; distributed databases; meta data; probability; query processing; Web databases; adaptive probing technique; database probing; hidden-Web databases; metasearchers; probabilistic relevancy modelling; Computer science; Costs; Databases; Internet; Merging; Sampling methods; Search engines; Web search;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2004. Proceedings. 20th International Conference on
ISSN :
1063-6382
Print_ISBN :
0-7695-2065-0
Type :
conf
DOI :
10.1109/ICDE.2004.1320026
Filename :
1320026