Title :
A XOR-Based Hierarchical Sketch for Identifying and Estimating Hierarchical Frequent Items Online
Author :
Wenfeng Feng ; Guo, Qiao ; Zhang, Zhibin ; Jia, Zongpu
Author_Institution :
Network Inf. Center, Beijing Inst. of Technol.
Abstract :
Many sketch data structures have been implemented. But due to our knowledge, the current sketching approach to summarizing the hierarchical structure in data was oracles-based, which "stack" logn sketches. By designing and using a XOR-based pair-wise independent family of hash functions on the hierarchical domain U*, a hierarchical sketch data structure was implemented to summarize the hierarchical structure in data stream, which mapped the data stream items to a three dimensional counter array of size LtimesDtimesW, of which, L was the number of layers in the hierarchical domain, D was the number of the hash functions chosen uniformly at random, and W was the range of the hash functions. The sketch used the breadth-first search strategy to identify and estimate the hierarchical frequent items online with user-specified probability. The experiments demonstrate substantial improvement in the update time and evaluating accuracy over the straightforward oracles-based sketch approach.
Keywords :
data structures; search problems; XOR-based hierarchical sketch; breadth-first search strategy; data stream; hash functions; hierarchical frequent items online; hierarchical sketch data structure; pair-wise independent family; sketch data structures; three dimensional counter array; Aggregates; Computer science; Counting circuits; Data mining; Data structures; Filtering; Frequency estimation; Telecommunication traffic; XML;
Conference_Titel :
Communications and Networking in China, 2006. ChinaCom '06. First International Conference on
Conference_Location :
Beijing
Print_ISBN :
1-4244-0463-0
Electronic_ISBN :
1-4244-0463-0
DOI :
10.1109/CHINACOM.2006.344662