• DocumentCode
    3085974
  • Title

    Adaptive-Size Reservoir Sampling over Data Streams

  • Author

    Al-Kateb, Mohammed ; Lee, Byung Suk ; Wang, X. Sean

  • Author_Institution
    Vermont Univ., Burlington
  • fYear
    2007
  • fDate
    9-11 July 2007
  • Firstpage
    22
  • Lastpage
    22
  • Abstract
    Reservoir sampling is a well-known technique for sequential random sampling over data streams. Conventional reservoir sampling assumes a fixed-size reservoir. There are situations, however, in which it is necessary and/or advantageous to adaptively adjust the size of a reservoir in the middle of sampling due to changes in data characteristics and/or application behavior. This paper studies adaptive size reservoir sampling over data streams considering two main factors: reservoir size and sample uniformity. First, the paper conducts a theoretical study on the effects of adjusting the size of a reservoir while sampling is in progress. The theoretical results show that such an adjustment may bring a negative impact on the probability of the sample being uniform (called uniformity confidence herein). Second, the paper presents a novel algorithm for maintaining the reservoir sample after the reservoir size is adjusted such that the resulting uniformity confidence exceeds a given threshold. Third, the paper extends the proposed algorithm to an adaptive multi-reservoir sampling algorithm for a practical application in which samples are collected from memory-limited wireless sensor networks using a mobile sink. Finally, the paper empirically examines the adaptivity of the multi-reservoir sampling algorithm with regard to reservoir size and sample uniformity using real sensor networks data sets.
  • Keywords
    adaptive signal processing; distributed databases; signal sampling; wireless sensor networks; adaptive multireservoir sampling algorithm; adaptive-size reservoir sampling; data stream; fixed-size reservoir; memory-limited wireless sensor network; mobile sink; sequential random sampling; Computer science; Monitoring; Navigation; Query processing; Reservoirs; Sampling methods; Sensor phenomena and characterization; Spatial databases; Warehousing; Wireless sensor networks;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Scientific and Statistical Database Management, 2007. SSBDM '07. 19th International Conference on
  • Conference_Location
    Banff, Alta.
  • ISSN
    1551-6393
  • Print_ISBN
    0-7695-2868-6
  • Electronic_ISBN
    1551-6393
  • Type

    conf

  • DOI
    10.1109/SSDBM.2007.29
  • Filename
    4274967