• DocumentCode
    3447
  • Title

    H-Tree: An Efficient Index Structurefor Event Matching in Content-BasedPublish/Subscribe Systems

  • Author

    Shiyou Qian ; Jian Cao ; Yanmin Zhu ; Minglu Li ; Jie Wang

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China
  • Volume
    26
  • Issue
    6
  • fYear
    2015
  • fDate
    June 1 2015
  • Firstpage
    1622
  • Lastpage
    1632
  • Abstract
    Content-based publish/subscribe systems have been employed to deal with complex distributed information flows in many applications. It is well recognized that event matching is a fundamental component of such large-scale systems. Event matching searches a space which is composed of all subscriptions. As the scale and complexity of a system grows, the efficiency of event matching becomes more critical to system performance. However, most existing methods suffer significant performance degradation when the system has large numbers of both subscriptions and their component constraints. In this paper, we present Hash Tree (H-Tree), a highly efficient index structure for event matching. H-Tree is a hash table in nature that is a combination of hash lists and hash chaining. A hash list is built up on an indexed attribute by realizing novel overlapping divisions of the attribute´s value domain, providing more efficient space consumption. Multiple hash lists are then combined into a hash tree. The basic idea behind H-Tree is that matching efficiencies are improved when the search space is substantially reduced by pruning most of the subscriptions that are not matched. We have implemented H-Tree and conducted extensive experiments in different settings. Experimental results demonstrate that H-Tree has better performance than its counterparts by a large margin. In particular, the matching speed is faster by three orders of magnitude than its counterparts when the numbers of both subscriptions and their component constraints are huge.
  • Keywords
    content management; data mining; indexing; message passing; middleware; pattern matching; tree data structures; H-tree; attribute value domain; complex distributed information flow; content-based publish/subscribe systems; event matching; hash chaining; hash list; hash table; hash tree; index structure; indexed attribute; large-scale system; Field-flow fractionation; Indexes; Memory management; Random variables; Silicon; Subscriptions; Upper bound; Content-based publish/subscribe; event matching; index structure; matching time; performance;
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2014.2323262
  • Filename
    6814863