Title :
Classification-Based Approximate Policy Iteration
Author :
Farahmand, Amir-massoud ; Precup, Doina ; Barreto, Andre M. S. ; Ghavamzadeh, Mohammad
Author_Institution :
Sch. of Comput. Sci., McGill Univ., Montreal, QC, Canada
Abstract :
Tackling large approximate dynamic programming or reinforcement learning problems requires methods that can exploit regularities of the problem at hand. Most current methods are geared towards exploiting the regularities of either the value function or the policy. We introduce a general classification-based approximate policy iteration (CAPI) framework that can exploit regularities of both. We establish theoretical guarantees for the sample complexity of CAPI-style algorithms, which allow the policy evaluation step to be performed by a wide variety of algorithms, and can handle nonparametric representations of policies. Our bounds on the estimation error of the performance loss are tighter than existing results.
Keywords :
approximation theory; dynamic programming; iterative methods; learning (artificial intelligence); pattern classification; CAPI framework; classification-based approximate policy iteration; reinforcement learning; Aerospace electronics; Algorithm design and analysis; Approximation algorithms; Complexity theory; Convergence; Estimation error; Upper bound; Approximate Dynamic Programming; Approximate Policy Iteration; Classification; Finite-Sample Analysis
Journal_Title :
IEEE Transactions on Automatic Control
DOI :
10.1109/TAC.2015.2418411