• DocumentCode
    2302196
  • Title
    Partition-Tolerant Distributed Publish/Subscribe Systems
  • Author
    Kazemzadeh, Reza Sherafat; Jacobsen, Hans-Arno
  • Author_Institution
    Univ. of Toronto, Toronto, ON, Canada
  • fYear
    2011
  • fDate
    4-7 Oct. 2011
  • Firstpage
    101
  • Lastpage
    110
  • Abstract
    In this paper, we develop reliable distributed publish/subscribe algorithms that can tolerate the concurrent failure of up to d broker machines or communication links. In our approach, d is a configuration parameter that determines the level of fault tolerance of the system, and reliability refers to exactly-once, per-source in-order delivery of publications to clients with matching subscriptions. We propose protocols to address three problems in the presence of broker or link failures: (i) subscription propagation, (ii) publication forwarding, and (iii) broker recovery. Finally, we study the effectiveness of our approach when the number of concurrent failures exceeds d. Through large-scale experimental evaluations with up to 500 brokers, we demonstrate that a system configured with a modest value of d = 3 is able to reliably deliver 97% of publications in the presence of failures of up to 17% of its brokers.
  • Keywords
    message passing; middleware; broker machines; broker recovery; configuration parameter; partition tolerant distributed publish systems; partition tolerant distributed subscribe systems; publication forwarding; subscription propagation; Computer crashes; Detectors; Reliability; Routing; Servers; Subscriptions; Videos; Fault-Tolerance; Publish/Subscribe
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems (SRDS), 2011 30th IEEE Symposium on
  • Conference_Location
    Madrid
  • ISSN
    1060-9857
  • Print_ISBN
    978-1-4577-1349-1
  • Type
    conf
  • DOI
    10.1109/SRDS.2011.21
  • Filename
    6076767