Title :
Recursive Learning Automata for Control of Partially Observable Markov Decision Processes
Author :
Chang, Hyeong Soo ; Fu, Michael C. ; Marcus, Steven I.
Author_Institution :
Department of Computer Science and Engineering, Sogang University, Seoul, Korea; Program of Integrated Biotechnology at Sogang University. hschang@sogang.ac.kr
Abstract :
This paper presents a sampling algorithm, called "Recursive Automata Sampling Algorithm (RASA)," for control of finite-horizon information-state Markov decision processes (MDPs), the equivalent model of partially observable MDPs (POMDPs). RASA recursively extends the Pursuit algorithm, designed with learning automata by Rajaraman and Sastry for solving stochastic optimization problems. Building on the finite-time analysis of the Pursuit algorithm, we analyze the finite-time behavior of RASA, providing a bound on the probability that the optimal action is taken at a given initial state, and a bound on the probability that the difference between the optimal value and its estimate exceeds a given error. We also discuss how to apply RASA directly in the context of POMDPs and how to incorporate heuristic knowledge into RASA for on-line control.
Keywords :
Algorithm design and analysis; Automatic control; Computer science; Educational institutions; Learning automata; Pursuit algorithms; Random variables; Sampling methods; State-space methods; Stochastic processes;
Conference_Title :
44th IEEE Conference on Decision and Control, 2005 and 2005 European Control Conference (CDC-ECC '05)
Print_ISBN :
0-7803-9567-0
DOI :
10.1109/CDC.2005.1583136