Regional Information Center for Science and Technology (RICeST) - Online learning algorithms for stochastic water-filling

DocumentCode :

1728144

Title :

Online learning algorithms for stochastic water-filling

Author :

Gai, Yi ; Krishnamachari, Bhaskar

Author_Institution :

Ming Hsieh Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA

fYear :

2012

Firstpage :

352

Lastpage :

356

Abstract :

Water-filling is the term for the classic solution to the problem of allocating constrained power to a set of parallel channels to maximize the total data-rate. It is used widely in practice, for example, for power allocation to sub-carriers in multi-user OFDM systems such as WiMax. The classic water-filling algorithm is deterministic and requires perfect knowledge of the channel gain to noise ratios. In this paper we consider how to do power allocation over stochastically time-varying (i.i.d.) channels with unknown gain to noise ratio distributions. We adopt an online learning framework based on stochastic multi-armed bandits. We consider two variations of the problem, one in which the goal is to find a power allocation to maximize Σ_i E[log (1+SNR_i)], and another in which the goal is to find a power allocation to maximize Σ_i log (1+E[SNR_i]). For the first problem, we propose a cognitive water-filling algorithm that we call CWF1. We show that CWF1 obtains a regret (defined as the cumulative gap over time between the sum-rate obtained by a distribution-aware genie and this policy) that grows polynomially in the number of channels and logarithmically in time, implying that it asymptotically achieves the optimal time-averaged rate that can be obtained when the gain distributions are known. For the second problem, we present an algorithm called CWF2, which is, to our knowledge, the first algorithm in the literature on stochastic multi-armed bandits to exploit non-linear dependencies between the arms. We prove that the number of times CWF2 picks the incorrect power allocation is bounded by a function that is polynomial in the number of channels and logarithmic in time, implying that its frequency of incorrect allocation tends to zero.
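
For orientation, the classic deterministic water-filling step that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration assuming the gain-to-noise ratios are known exactly; it is not the paper's CWF1 or CWF2 algorithm, which the abstract does not specify in detail. The function names `water_filling` and `sum_rate` and the example numbers are hypothetical.

```python
import math


def water_filling(gains, total_power):
    """Classic water-filling for known gain-to-noise ratios (illustrative only).

    Maximizes sum(log(1 + g_i * p_i)) subject to sum(p_i) == total_power
    and p_i >= 0. The solution is p_i = max(0, level - 1/g_i), where the
    "water level" is chosen so that the power budget is spent exactly.
    """
    if total_power < 0 or not gains or any(g <= 0 for g in gains):
        raise ValueError("need a non-negative budget and positive gains")

    # Inverse gains act as the "floor heights" of the vessel, lowest first.
    floors = sorted(1.0 / g for g in gains)

    # Try filling the k best channels, dropping the worst one each time
    # the resulting water level would not reach its floor.
    level = 0.0
    for k in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level >= floors[k - 1]:
            break

    return [max(0.0, level - 1.0 / g) for g in gains]


def sum_rate(gains, powers):
    """Total rate in nats: sum of log(1 + SNR_i) with SNR_i = g_i * p_i."""
    return sum(math.log(1.0 + g * p) for g, p in zip(gains, powers))


if __name__ == "__main__":
    gains = [2.0, 1.0, 0.1]   # hypothetical gain-to-noise ratios
    budget = 2.0              # hypothetical total power
    powers = water_filling(gains, budget)
    print(powers)             # → [1.25, 0.75, 0.0]
    print(sum_rate(gains, powers))
```

In the setting the abstract describes, the gains are random with unknown distributions, so a step like this cannot be applied directly; the learning algorithms have to estimate the relevant quantities from observations over time.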

Keywords :

OFDM modulation; cognitive radio; multiuser channels; optimisation; resource allocation; stochastic processes; time-varying channels; CWF1; WiMax; channel gain knowledge; classic water filling algorithm; cognitive water-filling algorithm; constrained power allocation maximization; gain to noise ratio distribution; multiuser OFDM system; nonlinear dependency; online learning algorithm; optimal time-averaged rate; parallel channel; polynomial; stochastic multiarmed bandits; stochastic water filling; stochastically time-varying channels; Channel estimation; Conferences; Learning systems; Optimization; Random variables; Resource management; Stochastic processes;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Information Theory and Applications Workshop (ITA), 2012

Conference_Location :

San Diego, CA

Print_ISBN :

978-1-4673-1473-2

Type :

conf

DOI :

10.1109/ITA.2012.6181777

Filename :

6181777

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1728144