Title :
Universal Reinforcement Learning
Author :
Farias, Vivek F. ; Moallemi, Ciamac C. ; Van Roy, Benjamin ; Weissman, Tsachy
Author_Institution :
Sloan Sch. of Manage., Massachusetts Inst. of Technol., Cambridge, MA, USA
Date :
5/1/2010
Abstract :
We consider an agent interacting with an unmodeled environment. At each time, the agent makes an observation, takes an action, and incurs a cost. Its actions can influence future observations and costs. The goal is to minimize the long-term average cost. We propose a novel algorithm, known as the active LZ algorithm, for optimal control based on ideas from the Lempel-Ziv scheme for universal data compression and prediction. We establish that, under the active LZ algorithm, if there exists an integer K such that the future is conditionally independent of the past given a window of K consecutive actions and observations, then the average cost converges to the optimum. Experimental results involving the game of Rock-Paper-Scissors illustrate the merits of the algorithm.
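The abstract builds on Lempel-Ziv-style prediction: an incremental parse of the observed sequence grows a context tree whose visit counts yield empirical next-symbol probabilities. The sketch below is NOT the paper's active LZ algorithm (which couples such estimates with dynamic programming over actions); it only illustrates the underlying LZ78-style parsing and prediction idea, with all class and method names (`Node`, `LZ78Predictor`, `update`, `predict`) chosen here for illustration.

```python
class Node:
    """One context in the LZ78 parse tree."""
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}  # symbol -> Node
        self.count = 0      # times this context was visited

class LZ78Predictor:
    """LZ78-style incremental parse: each new symbol extends the current
    phrase down the tree; when the phrase falls off the tree, a new leaf
    is added and parsing restarts at the root."""

    def __init__(self):
        self.root = Node()
        self.node = self.root  # current position in the parse tree

    def update(self, symbol):
        child = self.node.children.get(symbol)
        if child is None:
            # Phrase complete: add a new leaf, restart at the root.
            child = Node()
            self.node.children[symbol] = child
            child.count += 1
            self.node = self.root
        else:
            child.count += 1
            self.node = child

    def predict(self):
        """Empirical next-symbol distribution at the current context
        (empty dict if this context has no recorded continuations)."""
        total = sum(c.count for c in self.node.children.values())
        if total == 0:
            return {}
        return {s: c.count / total for s, c in self.node.children.items()}
```

For example, feeding the predictor a repeating Rock-Paper-Scissors sequence (`"RPSRPS..."`) builds up counts along the periodic contexts, so deeper contexts increasingly concentrate probability on the correct next symbol.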
Keywords :
data compression; dynamic programming; learning (artificial intelligence); optimal control; Lempel-Ziv scheme; active LZ algorithm; agent interaction; consecutive actions; consecutive observations; prediction; Rock-Paper-Scissors game; universal reinforcement learning; cost function; information theory; context tree; reinforcement learning; value iteration;
Journal_Title :
IEEE Transactions on Information Theory
DOI :
10.1109/TIT.2010.2043762