Title :
Fault Management Using the CONMan Abstraction
Author :
Ballani, Hitesh ; Francis, Paul
Author_Institution :
Cornell Univ., Ithaca, NY
Abstract :
Fault management in networks is difficult. We argue that a major contributor to the difficulty of debugging network faults is the sheer volume of semantically anemic details exposed by protocols. Unlike past approaches that try to cope with the deluge of information exposed, in this paper we explore how to reduce and structure the management information exposed by data-plane protocols and devices to make them more amenable to fault management. To this end, we delineate two conditions that the management interface of data-plane protocols should satisfy: it should provide a structured description of protocol reality, and it should support what we call a "conservation of bytes" invariant. Based on this, we propose an architecture wherein data-plane protocols expose management information satisfying these conditions. This allows management applications to detect, localize and (possibly) resolve faults in a structured fashion. We discuss the detection of a representative set of real-world faults to illustrate our approach. We implemented these fault management features in three protocols and built a management application that uses the features to debug faults. Apart from serving as a proof of concept, this exercise indicates that our proposal does indeed simplify debugging of a large fraction of network faults.
Keywords :
fault diagnosis; protocols; telecommunication network management; telecommunication network reliability; CONMan abstraction; data-plane protocols; fault management; management information; network fault debugging; protocol reality; Counting circuits; Debugging; Fault detection; Humans; Impedance; Information management; Packaging; Power system management; Proposals; Protocols
Conference_Title :
INFOCOM 2009, IEEE
Conference_Location :
Rio de Janeiro
Print_ISBN :
978-1-4244-3512-8
ISSN :
0743-166X
DOI :
10.1109/INFCOM.2009.5061985