Regional Information Center for Science and Technology - Feature Search in the Grassmanian in Online Reinforcement Learning

DocumentCode :

108664

Title :

Feature Search in the Grassmanian in Online Reinforcement Learning

Author :

Bhatnagar, Shalabh ; Borkar, Vivek S. ; Prabuchandran, K.J.

Author_Institution :

Dept. of Comput. Sci. & Autom., Indian Inst. of Sci., Bangalore, India

Volume :

Issue :

fYear :

2013

fDate :

Oct. 2013

Firstpage :

746

Lastpage :

758

Abstract :

We consider the problem of finding the best features for value function approximation in reinforcement learning and develop an online algorithm to optimize the mean square Bellman error objective. For any given feature value, our algorithm performs gradient search in the parameter space via a residual gradient scheme and, on a slower timescale, also performs gradient search in the Grassman manifold of features. We present a proof of convergence of our algorithm. We show empirical results using our algorithm as well as a similar algorithm that uses temporal difference learning in place of the residual gradient scheme for the faster timescale updates.

Keywords :

approximation theory; gradient methods; learning (artificial intelligence); search problems; Grassman manifold; feature search; gradient search; mean square Bellman error objective; online algorithm; online reinforcement learning; parameter space; residual gradient scheme; temporal difference learning; value function approximation; Approximation algorithms; Convergence; Function approximation; Learning (artificial intelligence); Signal processing algorithms; Vectors; Feature adaptation; Grassman manifold; online learning; residual gradient scheme; stochastic approximation; temporal difference learning

fLanguage :

English

Journal_Title :

Selected Topics in Signal Processing, IEEE Journal of

Publisher :

IEEE

ISSN :

1932-4553

Type :

jour

DOI :

10.1109/JSTSP.2013.2255022

Filename :

6488714

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=108664