DocumentCode :
1820050
Title :
Time-decaying Bloom Filters for data streams with skewed distributions
Author :
Cheng, Kai ; Xiang, Limin
Author_Institution :
Kyushu Sangyo Univ., Japan
fYear :
2005
fDate :
3-4 April 2005
Firstpage :
63
Lastpage :
69
Abstract :
Bloom Filters are space-efficient data structures for membership queries over sets. To support queries on the multiplicities of multi-sets, the bitmap in a Bloom Filter is replaced by an array of counters that are incremented on each occurrence. In a data stream model, however, data items arrive at varying rates, and recent occurrences are often regarded as more significant than past ones. In most data stream applications, it is critical to handle this "time-sensitivity". Furthermore, data streams with skewed distributions are common in many emerging applications, e.g., traffic engineering and billing, intrusion detection, trading surveillance and outlier detection. For such applications, it is inefficient to allocate counters of uniform size to all buckets. In this paper, we present Time-decaying Bloom Filters (TBF), a Bloom Filter variant that maintains a frequency count for each item in a data stream, with the value of each counter decaying over time. For data streams with highly skewed distributions, we propose a further optimization that dynamically allocates free counters to the "large" items. We performed preliminary experiments to verify the optimization.
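The idea of counters that decay with time can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the decay function, so the exponential decay, the `half_life` parameter, the lazy per-counter timestamps, the SHA-256 double hashing, and the class name `TimeDecayingBloomFilter` are all assumptions made for illustration. The dynamic allocation of free counters to "large" items is not covered here.

```python
import hashlib
import math


class TimeDecayingBloomFilter:
    """Counting Bloom filter whose counters decay with time.

    Assumptions (not taken from the abstract): exponential decay with a
    configurable half-life, decay applied lazily when a counter is touched,
    and k positions derived from SHA-256 via double hashing.
    """

    def __init__(self, num_counters=1024, num_hashes=4, half_life=60.0):
        self.m = num_counters
        self.k = num_hashes
        self.decay_rate = math.log(2) / half_life
        self.counters = [0.0] * num_counters
        self.last_update = [0.0] * num_counters

    def _positions(self, item):
        # Derive k counter indices from two 64-bit halves of one digest.
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def _decayed(self, idx, now):
        # Value of counter idx after decaying from its last update to `now`.
        elapsed = max(0.0, now - self.last_update[idx])
        return self.counters[idx] * math.exp(-self.decay_rate * elapsed)

    def add(self, item, now):
        # Decay each distinct counter to `now`, then record one occurrence.
        for idx in set(self._positions(item)):
            self.counters[idx] = self._decayed(idx, now) + 1.0
            self.last_update[idx] = now

    def estimate(self, item, now):
        # Minimum over the item's counters: an upper bound on its decayed
        # count, since hash collisions can only add to a counter.
        return min(self._decayed(idx, now) for idx in self._positions(item))


tbf = TimeDecayingBloomFilter(half_life=10.0)
for t in (0.0, 1.0, 2.0):
    tbf.add("flow-A", now=t)
print(tbf.estimate("flow-A", now=2.0))   # decayed count right after the last arrival
print(tbf.estimate("flow-A", now=12.0))  # one half-life later
```

Under these assumptions, an item that stops arriving loses half of its estimated count every `half_life` time units, so recent occurrences weigh more than older ones.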
Keywords :
data models; query processing; statistical distributions; data items; data stream model; free counter allocation; highly skewed distributions; intrusion detection; membership queries; outlier detection; space-efficient data structures; time-decaying Bloom Filters; trading surveillance; traffic engineering; Costs; Counting circuits; Data engineering; Data structures; Frequency; Information filtering; Information filters; Maintenance engineering; Search engines; Statistics;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Research Issues in Data Engineering: Stream Data Mining and Applications, 2005. RIDE-SDMA 2005. 15th International Workshop on
ISSN :
1097-8585
Print_ISBN :
0-7695-2390-0
Type :
conf
DOI :
10.1109/RIDE.2005.15
Filename :
1498232