• DocumentCode
    2695746
  • Title

    Using reinforcement learning for pro-active network fault management

  • Author

    He, Qiming ; Shayman, Mark A.

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Maryland Univ., College Park, MD, USA
  • Volume
    1
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    515
  • Abstract
    For high-speed networks, it is important that fault management be pro-active, i.e., that it detect, diagnose, and mitigate problems before they result in severe degradation of network performance. Pro-active fault management depends on monitoring the network to obtain the data on which to base manager decisions. However, monitoring introduces additional overhead that may itself degrade network performance, especially when the network is in a stressed state. Thus, a trade-off must be made between the amount of data collected and transferred on the one hand, and the speed and accuracy of fault detection and diagnosis on the other. Such a trade-off can be naturally formulated as a partially observable Markov decision process (POMDP), whose solution can be used to construct a decision-rule for both centralized and distributed intelligent agents. Since the exact solution of POMDPs for a realistic number of states is computationally prohibitive, we develop a fast reinforcement-learning-based algorithm that learns the decision-rule in an approximate network simulator, so that the rule can be deployed quickly to the real network. Simulation results are given for diagnosing a switch fault in an ATM network.
  • Keywords
    Markov processes; computerised monitoring; decision theory; fault diagnosis; learning (artificial intelligence); telecommunication computing; telecommunication network management; ATM network; data collection; data transfer; decision-rule; fault detection; fault diagnosis; high-speed networks; intelligent agents; manager decisions; network monitoring; network performance; partially observable Markov decision process; proactive network fault management; reinforcement learning; switch fault; Computational modeling; Computer networks; Degradation; Fault detection; Fault diagnosis; High-speed networks; Intelligent agent; Learning; Monitoring; Switches;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communication Technology Proceedings, 2000. WCC - ICCT 2000. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7803-6394-9
  • Type

    conf

  • DOI
    10.1109/ICCT.2000.889257
  • Filename
    889257