• DocumentCode
    566177
  • Title

    Online learning to optimize transmission over an unknown Gilbert-Elliott Channel

  • Author

    Wu, Yanting ; Krishnamachari, Bhaskar

  • Author_Institution
    Dept. of Electrical Engineering, University of Southern California, USA
  • fYear
    2012
  • fDate
    14-18 May 2012
  • Firstpage
    27
  • Lastpage
    32
  • Abstract
    This paper studies the optimal transmission policy for a Gilbert-Elliott Channel. The transmitter has two actions: sending aggressively or sending conservatively, with rewards depending on the action chosen and the underlying channel state. The aim is to compute the scheduling policy to determine which actions to choose at each time slot in order to maximize the expected total discounted reward. We first establish the threshold structure of the optimal policy when the underlying channel statistics are known. We then consider the more challenging case when the statistics are unknown. For this problem, we map different threshold policies to arms of a suitably defined multiarmed bandit problem. To tractably handle the complexity introduced by countably infinite arms and the infinite time horizon, we weaken our objective a little: finding a (OPT − (∈ + δ))-approximate policy instead. We present the UCB-P algorithm, which can achieve this objective with logarithmic-time regret.
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt), 2012 10th International Symposium on
  • Conference_Location
    Paderborn, Germany
  • Print_ISBN
    978-1-4673-2294-2
  • Electronic_ISBN
    978-3-901882-47-0
  • Type

    conf

  • Filename
    6260468