• DocumentCode
    847996
  • Title

    Reinforcement Learning for Resource Allocation in LEO Satellite Networks

  • Author

    Usaha, Wipawee ; Barria, Javier A.

  • Author_Institution
    Sch. of Telecommun. Eng., Suranaree Univ. of Technol., Nakorn Ratchasima
  • Volume
    37
  • Issue
    3
  • fYear
    2007
  • fDate
    6/1/2007 12:00:00 AM
  • Firstpage
    515
  • Lastpage
    527
  • Abstract
    In this paper, we develop and assess online decision-making algorithms for call admission and routing for low Earth orbit (LEO) satellite networks. It has been shown in a recent paper that, in a LEO satellite system, a semi-Markov decision process formulation of the call admission and routing problem can achieve better performance in terms of an average revenue function than existing routing methods. However, the conventional dynamic programming (DP) numerical solution becomes prohibitive as the problem size increases. In this paper, two solution methods based on reinforcement learning (RL) are proposed in order to circumvent the computational burden of DP. The first method is based on an actor-critic method with temporal-difference (TD) learning. The second method is based on a critic-only method, called optimistic TD learning. The algorithms enhance performance in terms of requirements in storage, computational complexity and computational time, and in terms of an overall long-term average revenue function that penalizes blocked calls. Numerical studies are carried out, and the results obtained show that the RL framework can achieve up to 56% higher average revenue over existing routing methods used in LEO satellite networks with reasonable storage and computational requirements.
  • Keywords
    Markov processes; decision making; dynamic programming; learning (artificial intelligence); resource allocation; satellite communication; telecommunication computing; telecommunication congestion control; telecommunication network routing; call admission control; computational complexity; dynamic programming; low Earth orbit satellite network; online decision-making algorithm; reinforcement learning; resource allocation; semi-Markov decision process; temporal-difference learning; Bandwidth; Costs; Dynamic programming; Learning; Low earth orbit satellites; Propagation delay; Resource management; Routing; Satellite broadcasting; Topology; Call admission control (CAC); low Earth orbit (LEO) satellite network; reinforcement learning (RL); routing; temporal-difference (TD) learning; Algorithms; Artificial Intelligence; Computer Communication Networks; Decision Support Techniques; Pattern Recognition, Automated; Resource Allocation; Signal Processing, Computer-Assisted; Spacecraft;
  • fLanguage
    English
  • Journal_Title
    Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1083-4419
  • Type

    jour

  • DOI
    10.1109/TSMCB.2006.886173
  • Filename
    4200818