DocumentCode :
1728144
Title :
Online learning algorithms for stochastic water-filling
Author :
Gai, Yi ; Krishnamachari, Bhaskar
Author_Institution :
Ming Hsieh Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
fYear :
2012
Firstpage :
352
Lastpage :
356
Abstract :
Water-filling is the term for the classic solution to the problem of allocating constrained power to a set of parallel channels to maximize the total data rate. It is used widely in practice, for example, for power allocation to sub-carriers in multi-user OFDM systems such as WiMax. The classic water-filling algorithm is deterministic and requires perfect knowledge of the channel gain-to-noise ratios. In this paper we consider how to do power allocation over stochastically time-varying (i.i.d.) channels with unknown gain-to-noise ratio distributions. We adopt an online learning framework based on stochastic multi-armed bandits. We consider two variations of the problem, one in which the goal is to find a power allocation to maximize Σ_i E[log(1 + SNR_i)], and another in which the goal is to find a power allocation to maximize Σ_i log(1 + E[SNR_i]). For the first problem, we propose a cognitive water-filling algorithm that we call CWF1. We show that CWF1 obtains a regret (defined as the cumulative gap over time between the sum-rate obtained by a distribution-aware genie and this policy) that grows polynomially in the number of channels and logarithmically in time, implying that it asymptotically achieves the optimal time-averaged rate that can be obtained when the gain distributions are known. For the second problem, we present an algorithm called CWF2, which is, to our knowledge, the first algorithm in the literature on stochastic multi-armed bandits to exploit non-linear dependencies between the arms. We prove that the number of times CWF2 picks the incorrect power allocation is bounded by a function that is polynomial in the number of channels and logarithmic in time, implying that its frequency of incorrect allocation tends to zero.
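For orientation, the classic deterministic water-filling step that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration that assumes the gain-to-noise ratios are known exactly; it is not CWF1 or CWF2, whose details are not given in this record. The function names `water_filling` and `sum_rate` and the example values are hypothetical, chosen only for illustration.

```python
import math


def water_filling(gains, total_power):
    """Classic water-filling with KNOWN gain-to-noise ratios (illustrative sketch).

    Maximizes sum_i log(1 + g_i * p_i) subject to sum_i p_i = total_power
    and p_i >= 0. The solution has the form p_i = max(0, level - 1/g_i).
    """
    if total_power < 0 or any(g <= 0 for g in gains):
        raise ValueError("total_power must be >= 0 and all gains must be > 0")

    # Each channel has a "floor" of 1/g_i; better channels have lower floors.
    floors = sorted(1.0 / g for g in gains)

    level = 0.0
    cumulative = 0.0
    for k, floor in enumerate(floors, start=1):
        cumulative += floor
        # Water level if exactly the k best channels share the power.
        candidate = (total_power + cumulative) / k
        if candidate > floor:
            level = candidate  # the k-th channel still gets positive power
        else:
            break  # the k-th channel (and all worse ones) stay off

    return [max(0.0, level - 1.0 / g) for g in gains]


def sum_rate(gains, powers):
    """Total rate sum_i log(1 + g_i * p_i), in nats."""
    return sum(math.log(1.0 + g * p) for g, p in zip(gains, powers))


# Hypothetical example: three channels, total power budget of 2.
example_gains = [2.0, 1.0, 0.1]
example_powers = water_filling(example_gains, 2.0)
```

In this example the weakest channel receives no power because its floor 1/g lies above the water level. In the stochastic setting described in the abstract, the gains would not be available as inputs and would have to be learned from observations.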
Keywords :
OFDM modulation; cognitive radio; multiuser channels; optimisation; resource allocation; stochastic processes; time-varying channels; CWF1; WiMax; channel gain knowledge; classic water filling algorithm; cognitive water-filling algorithm; constrained power allocation maximization; gain to noise ratio distribution; multiuser OFDM system; nonlinear dependency; online learning algorithm; optimal time-averaged rate; parallel channel; polynomial; stochastic multiarmed bandits; stochastic water filling; stochastically time-varying channels; Channel estimation; Conferences; Learning systems; Optimization; Random variables; Resource management; Stochastic processes;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Theory and Applications Workshop (ITA), 2012
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4673-1473-2
Type :
conf
DOI :
10.1109/ITA.2012.6181777
Filename :
6181777