• DocumentCode
    1474064
  • Title

    Efficient and Adaptive Stateful Replication for Stream Processing Engines in High-Availability Cluster

  • Author

    Yi-Hsuan Feng ; Nen-Fu Huang ; Yen-Min Wu

  • Author_Institution
    Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
  • Volume
    22
  • Issue
    11
  • fYear
    2011
  • Firstpage
    1788
  • Lastpage
    1796
  • Abstract
    Stateful stream processing engines in high-availability clusters (HACs) track a large number of concurrent flow states and replicate them to backups to provide reliable functionality. Under high traffic loads, existing solutions in such HACs are expensive owing to precise stateful replication. This work presents two novel methods to address this issue: randomization of the replication representation and a replication scheme designed for when the system becomes overloaded. A hashing structure called the Multilevel Counting Bloom Filter (MLCBF) is proposed as a low-resource solution for stateful replication. Its performance and tradeoffs are then evaluated based on theoretical analysis and extensive trace-based tests. Trace-based simulation reveals that MLCBF typically reduces the network and memory requirements of replication by over 90 percent for URL categorization. Most importantly, MLCBF is simple and practical to implement and maintain. Moreover, an adaptive scheme called dynamic lazy insertion is designed to prevent replication from continuously overloading the system and to optimize the throughput of the HAC. Testbed evaluation demonstrates its feasibility and effectiveness in an overloaded HAC.
  • Keywords
    data structures; filtering theory; parallel processing; HAC; URL categorization; dynamic lazy insertion; hashing structure; high-availability cluster; multilevel counting Bloom filter; stateful replication; stream processing engines; trace-based simulation; Adaptive estimation; Clustering methods; Filters; Random processes; Multiple hash functions; adaptive method; Bloom filters; high availability; replication
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2011.83
  • Filename
    5733339