• DocumentCode
    1840477
  • Title
    Self checking network protocols: a monitor based approach
  • Author
    Khanna, Gunjan ; Varadharajan, Padma ; Bagchi, Saurabh
  • Author_Institution
    Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
  • fYear
    2004
  • fDate
    18-20 Oct. 2004
  • Firstpage
    18
  • Lastpage
    30
  • Abstract
    The wide deployment of high-speed computer networks has made distributed systems ubiquitous in today's connected world. The machines on which the distributed applications are hosted are heterogeneous in nature, the applications often run legacy code without the availability of their source code, the systems are of very large scales, and often have soft real-time guarantees. In this paper, we target the problem of online detection of disruptions through a generic external entity called Monitor that is able to observe the exchanged messages between the protocol participants and deduce any ongoing disruption by matching against a rule base composed of combinatorial and temporal rules. The Monitor architecture is application neutral, with the rule base making it specific to a protocol. To make the detection infrastructure scalable and dependable, we extend it to a hierarchical Monitor structure. The infrastructure is applied to a streaming video application running on a reliable multicast protocol called TRAM installed on the campus wide network. The evaluation brings out the scalability of the monitor infrastructure and detection coverage under different kinds of faults for the single level and the hierarchical arrangements.
  • Keywords
    checkpointing; computer networks; message passing; protocols; supervisory programs; video streaming; TRAM; combinatorial rules; hierarchical monitor structure; legacy code; message exchange; monitor based approach; multicast protocol; online disruption detection; self checking network protocols; source code; streaming video; temporal rules; ubiquitous distributed systems; Monitoring; Protocols
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems, 2004. Proceedings of the 23rd IEEE International Symposium on
  • ISSN
    1060-9857
  • Print_ISBN
    0-7695-2239-4
  • Type
    conf
  • DOI
    10.1109/RELDIS.2004.1353000
  • Filename
    1353000