Title :
Scalable reward learning from demonstration
Author :
Michini, Bernard ; Cutler, Mark ; How, Jonathan P.
Author_Institution :
Aerosp. Controls Lab., Massachusetts Inst. of Technol., Cambridge, MA, USA
Abstract :
Reward learning from demonstration is the task of inferring the intents or goals of an agent demonstrating a task. Inverse reinforcement learning methods utilize the Markov decision process (MDP) framework to learn rewards, but typically scale poorly because they rely on computing optimal value functions. Several key modifications are made to a previously developed Bayesian nonparametric inverse reinforcement learning algorithm that avoid computing an optimal value function and no longer require discretization of the state or action spaces. Experimental results demonstrate the ability of the resulting algorithm to scale to larger problems and to learn in domains with continuous demonstrations.
Keywords :
Bayes methods; Markov processes; intelligent robots; learning (artificial intelligence); nonparametric statistics; Bayesian nonparametric inverse reinforcement learning algorithm; MDP framework; Markov decision process framework; inverse reinforcement learning method; optimal value functions; scalable reward learning from demonstration;
Conference_Titel :
2013 IEEE International Conference on Robotics and Automation (ICRA)
Conference_Location :
Karlsruhe, Germany
Print_ISBN :
978-1-4673-5641-1
DOI :
10.1109/ICRA.2013.6630592