Title :
A potential-based method for finite-stage Markov Decision Process
Author_Institution :
Dept. of Autom., Tsinghua Univ., Beijing
Abstract :
The finite-stage Markov decision process (MDP) provides a general framework for many practical problems in which only the performance over a finite horizon is of interest. Dynamic programming (DP) offers a general way to find optimal policies, but it is usually infeasible in practice because the policy space grows exponentially with the number of stages. Approximating the finite-stage MDP by an infinite-stage MDP reduces the search space, but the approximation error usually prevents this approach from finding the optimal stationary policy. We develop a method that finds optimal stationary policies for the finite-stage MDP. The method is based on performance potentials, which can be estimated from sample paths and is therefore well suited to practical applications.
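As an illustration of the sample-path idea mentioned above (not code from the paper), the sketch below estimates the performance potentials of a small, hypothetical ergodic Markov chain from a single simulated path: the potential of state i is approximated by the average truncated sum of centered rewards following each visit to i. The transition matrix `P`, reward vector `f`, and truncation window `W` are all assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state ergodic chain and per-state reward (illustrative only).
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.5, 0.3],
              [0.1, 0.4, 0.5]])
f = np.array([1.0, 0.0, 2.0])
S, T, W = 3, 1_000_000, 50        # states, path length, truncation window

# Simulate one sample path X_0, ..., X_{T-1} via inverse-CDF sampling.
cdf = P.cumsum(axis=1)
u = rng.random(T)
path = np.empty(T, dtype=np.int64)
path[0] = 0
for t in range(1, T):
    path[t] = np.searchsorted(cdf[path[t - 1]], u[t])

eta = f[path].mean()               # estimate of the long-run average reward

# Potential estimate: g(i) ~ E[ sum_{t=0}^{W-1} (f(X_t) - eta) | X_0 = i ],
# averaged over every visit to state i along the path.
excess = f[path] - eta
cs = np.concatenate(([0.0], np.cumsum(excess)))
window_sums = cs[W:T] - cs[0:T - W]    # sum of excess over [t, t+W)
g = np.zeros(S)
n = np.zeros(S)
np.add.at(g, path[:T - W], window_sums)
np.add.at(n, path[:T - W], 1.0)
g /= n                              # potential estimates (normalized so pi.g ~ 0)
print(np.round(g, 2))
```

Once potentials are in hand, a policy-iteration step compares actions by their one-step reward plus the potential of the successor states, which is how potential-based methods avoid enumerating the full policy space.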
Keywords :
Markov processes; dynamic programming; finite-stage Markov decision process; optimal stationary policy; potential-based method; approximation error; decision making; state estimation; state-space methods; uncertainty; performance potentials; finite-stage Markov Decision Processes; policy iteration; stationary policy;
Conference_Titel :
American Control Conference, 2008
Conference_Location :
Seattle, WA
Print_ISBN :
978-1-4244-2078-0
Electronic_ISBN :
0743-1619
DOI :
10.1109/ACC.2008.4587291