DocumentCode :
3274810
Title :
Stochastic policy search for variance-penalized semi-Markov control
Author :
Gosavi, Abhijit ; Purohit, Mandar
Author_Institution :
219 Eng. Manage., Missouri Univ. of Sci. & Technol., Rolla, MO, USA
fYear :
2011
fDate :
11-14 Dec. 2011
Firstpage :
2860
Lastpage :
2871
Abstract :
The variance-penalized metric in Markov decision processes (MDPs) seeks to maximize the average reward minus a scalar times the variance of rewards. In this paper, our goal is to study the same metric in the context of the semi-Markov decision process (SMDP). In the SMDP, unlike the MDP, the time spent in each transition is not identical and may in fact be a random variable. We first develop an expression for the variance of rewards in the SMDP, and then formulate the variance-penalized SMDP (VP-SMDP) problem. Our interest here is in solving the problem without generating the underlying transition probabilities of the Markov chains. We propose the use of two stochastic search techniques, namely simultaneous perturbation and learning automata, to solve the problem; these techniques use stochastic policies and can be used within simulators, thereby avoiding the generation of the transition probabilities.
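The variance-penalized objective described above can be illustrated with a short sketch. The function below is a hypothetical, simplified estimator (not the paper's exact formulation): given a simulated trajectory of per-transition rewards and random sojourn times, it returns the average reward rate minus a penalty weight times the sample variance of the rewards, mirroring the idea that the metric can be estimated inside a simulator without the transition probabilities.

```python
import random

def vp_score(rewards, times, theta=0.5):
    """Illustrative variance-penalized score for a simulated SMDP trajectory:
    long-run average reward per unit time, minus theta times the sample
    variance of per-transition rewards. `theta` is the penalty scalar."""
    n = len(rewards)
    rho = sum(rewards) / sum(times)                      # average reward rate
    mean_r = sum(rewards) / n
    var = sum((r - mean_r) ** 2 for r in rewards) / n    # reward variance
    return rho - theta * var

# Toy trajectory: random rewards and random (non-identical) sojourn times,
# as would be produced by simulating a fixed stochastic policy.
random.seed(0)
rewards = [random.uniform(0.0, 10.0) for _ in range(1000)]
times = [random.expovariate(1.0) for _ in range(1000)]
print(vp_score(rewards, times))
```

A stochastic search method such as simultaneous perturbation would treat this score as a noisy function of the policy parameters and perturb them to estimate a gradient; no transition-probability matrices are ever constructed.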
Keywords :
Markov processes; learning automata; probability; problem solving; Markov chains; SMDP; semi-Markov decision process; stochastic search techniques; transition probability; variance-penalized semi-Markov control; Computational modeling; Limiting; Measurement; Optimization; Vectors;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Proceedings of the 2011 Winter Simulation Conference (WSC)
Conference_Location :
Phoenix, AZ
ISSN :
0891-7736
Print_ISBN :
978-1-4577-2108-3
Electronic_ISBN :
0891-7736
Type :
conf
DOI :
10.1109/WSC.2011.6147989
Filename :
6147989