DocumentCode
3268750
Title
A probabilistic approach to metasearching with adaptive probing
Author
Liu, Zhenyu ; Luo, Chang ; Cho, Junghoo ; Chu, Wesley W.
Author_Institution
Dept. of Comput. Sci., California Univ., Los Angeles, CA, USA
fYear
2004
fDate
30 March-2 April 2004
Firstpage
547
Lastpage
558
Abstract
An ever-increasing amount of valuable information is stored in Web databases, "hidden" behind search interfaces. To save the user's effort in manually exploring each database, metasearchers automatically select the databases most relevant to a user's query. In this paper, we focus on one of the technical challenges in metasearching, namely database selection. Past research uses a precollected summary of each database to estimate its "relevancy" to the query, and in many cases makes incorrect database selections. We propose two techniques: probabilistic relevancy modelling and adaptive probing. First, we model the relevancy of each database to a given query as a probability distribution, derived by sampling that database. Using the probabilistic model, the user can explicitly specify a desired level of certainty for database selection. The adaptive probing technique then decides which and how many databases to contact in order to satisfy the user's requirement. Our experiments on real hidden-Web databases indicate that our approach significantly improves the accuracy of database selection at the cost of a small number of database probes.
Keywords
Internet; distributed databases; meta data; probability; query processing; Web databases; adaptive probing technique; database probing; hidden-Web databases; metasearchers; probabilistic relevancy modelling; Computer science; Costs; Databases; Internet; Merging; Sampling methods; Search engines; Web search;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering, 2004. Proceedings. 20th International Conference on
ISSN
1063-6382
Print_ISBN
0-7695-2065-0
Type
conf
DOI
10.1109/ICDE.2004.1320026
Filename
1320026