• DocumentCode
    2314833
  • Title

    A XOR-Based Hierarchical Sketch for Identifying and Estimating Hierarchical Frequent Items Online

  • Author

    Wenfeng Feng ; Guo, Qiao ; Zhang, Zhibin ; Jia, Zongpu

  • Author_Institution
    Network Inf. Center, Beijing Inst. of Technol.
  • fYear
    2006
  • fDate
    25-27 Oct. 2006
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Many sketch data structures have been implemented. But due to our knowledge, the current sketching approach to summarizing the hierarchical structure in data was oracles-based, which "stack" logn sketches. By designing and using a XOR-based pair-wise independent family of hash functions on the hierarchical domain U*, a hierarchical sketch data structure was implemented to summarize the hierarchical structure in data stream, which mapped the data stream items to a three dimensional counter array of size LtimesDtimesW, of which, L was the number of layers in the hierarchical domain, D was the number of the hash functions chosen uniformly at random, and W was the range of the hash functions. The sketch used the breadth-first search strategy to identify and estimate the hierarchical frequent items online with user-specified probability. The experiments demonstrate substantial improvement in the update time and evaluating accuracy over the straightforward oracles-based sketch approach.
  • Keywords
    data structures; search problems; XOR-based hierarchical sketch; breadth-first search strategy; data stream; hash functions; hierarchical frequent items online; hierarchical sketch data structure; pair-wise independent family; sketch data structures; three dimensional counter array; Aggregates; Computer science; Counting circuits; Data mining; Data structures; Filtering; Frequency estimation; Telecommunication traffic; XML;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Communications and Networking in China, 2006. ChinaCom '06. First International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    1-4244-0463-0
  • Electronic_ISBN
    1-4244-0463-0
  • Type

    conf

  • DOI
    10.1109/CHINACOM.2006.344662
  • Filename
    4149915