DocumentCode :
3426078
Title :
Estimating Query Result Sizes for Proxy Caching in Scientific Database Federations
Author :
Malik, Tanu ; Burns, Randal ; Chawla, Nitesh V. ; Szalay, Alex
Author_Institution :
Dept. of Comput. Sci., Johns Hopkins Univ., Baltimore, MD
fYear :
2006
fDate :
Nov. 2006
Firstpage :
36
Lastpage :
36
Abstract :
In a proxy cache for federations of scientific databases, it is important to estimate the size of a query result before making a caching decision. With accurate estimates, near-optimal cache performance can be obtained. At the other extreme, inaccurate estimates can render the cache totally ineffective. We present classification and regression over templates (CAROT), a general method for estimating query result sizes that is suited to the resource-limited environment of proxy caches and the distributed nature of database federations. CAROT estimates query result sizes by learning the distribution of query results, not by examining or sampling data, but by observing the workload. We have integrated CAROT into the proxy cache of the National Virtual Observatory (NVO) federation of astronomy databases. Experiments conducted in the NVO show that CAROT dramatically outperforms conventional estimation techniques and provides near-optimal cache performance.
Keywords :
astronomy computing; cache storage; data mining; pattern classification; query processing; regression analysis; scientific information systems; astronomy database; classification tree; distributed database; machine learning; proxy cache; query size estimation; scientific database federation; Astronomy; Bandwidth; Computer science; Distributed databases; Observatories; Permission; Physics; Sampling methods; Yield estimation; proxy caching; scientific federations
fLanguage :
English
Publisher :
IEEE
Conference_Title :
SC 2006 Conference, Proceedings of the ACM/IEEE
Conference_Location :
Tampa, FL
Print_ISBN :
0-7695-2700-0
Electronic_ISBN :
0-7695-2700-0
Type :
conf
DOI :
10.1109/SC.2006.27
Filename :
4090210