Dynamically identifying regenerative cycles in Simulation-based Optimization Algorithms for Markov chains

Author

Campos-Nanez, Enrique ; Patek, S.D.

Author_Institution

Dept. of Eng. Manage. & Syst. Eng., George Washington Univ., DC, USA

Volume

49

Issue

6

fYear

2004

fDate

6/1/2004 12:00:00 AM

Firstpage

1022

Lastpage

1025

Abstract

Simulation-based algorithms for maximizing the average reward of a parameterized Markov chain often rely upon the existence of a state which is recurrent for all choices of parameter values. The question of which recurrent state should serve to mark the end of a regenerative cycle is a very important practical consideration in applications. Even when all of the states of the process are recurrent, some states tend to be visited more often than others, and lengthy renewal cycles tend to result in high variance estimates of the gradient. To address this difficulty, we analyze a mechanism for adjusting this special state dynamically (i^*-adaptation) as applied to the "batch" simulation-based optimization algorithm of a previous paper. We show that the desirable convergence properties of the original "batch" algorithm are retained with i^*-adaptation, namely the almost sure convergence of the parameter vector to a critical point.

Keywords

Markov processes; convergence; gradient methods; optimisation; Markov chains; convergence; gradient method; parameter vector; recurrent state; regenerative cycles identification; simulation based optimization; Algorithm design and analysis; Analytical models; Computational modeling; Convergence; Costs; Engineering profession; Recursive estimation; State estimation; State-space methods; Systems engineering and theory; Markov chains; regenerative cycles; simulation-based optimization;

fLanguage

English

Journal_Title

Automatic Control, IEEE Transactions on

Publisher

ieee

ISSN

0018-9286

Type

jour

DOI

10.1109/TAC.2004.829637

Filename

1304931