DocumentCode :
3085985
Title :
Reservoir Sampling over Memory-Limited Stream Joins
Author :
Al-Kateb, Mohammed ; Lee, Byung Suk ; Wang, X. Sean
Author_Institution :
Univ. of Vermont, Burlington
fYear :
2007
fDate :
9-11 July 2007
Firstpage :
23
Lastpage :
23
Abstract :
In stream join processing with limited memory, uniform random sampling is useful for approximate query evaluation. In this paper, we address the problem of reservoir sampling over memory-limited stream joins. We present two sampling algorithms, reservoir join-sampling (RJS) and progressive reservoir join-sampling (PRJS). RJS is a straightforward design that applies fixed-size reservoir sampling to a join-sample (i.e., a random sample of a join output stream). Whenever the sample in the reservoir is used, RJS gives a uniform random sample of the original join output stream. With limited memory, however, the available memory may not be large enough even for the join buffer, thereby severely limiting the reservoir size. PRJS alleviates this problem by increasing the reservoir size during join-sampling. This increase is possible because the memory required by the join-sampling algorithm decreases over time. A larger reservoir provides a closer representation of the original join output stream. However, it lowers the probability that the sample is uniform. Through experiments, we examine the tradeoffs and compare the two algorithms in terms of the aggregation error on the reservoir sample.
Keywords :
query processing; random processes; approximate query evaluation; memory-limited stream joins; progressive reservoir join-sampling; reservoir sampling; stream join processing; uniform random sampling; Algorithm design and analysis; Buffer storage; Computer science; Conference management; Databases; Query processing; Reservoirs; Sampling methods; Statistical analysis; Wireless sensor networks;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Scientific and Statistical Database Management, 2007. SSDBM '07. 19th International Conference on
Conference_Location :
Banff, Alta.
ISSN :
1551-6393
Print_ISBN :
0-7695-2868-6
Electronic_ISBN :
1551-6393
Type :
conf
DOI :
10.1109/SSDBM.2007.40
Filename :
4274968