Regional Information Center for Science and Technology (RICeST) - Mode-matching control policies for multi-mode Markov decision processes

DocumentCode :

1751307

Title :

Mode-matching control policies for multi-mode Markov decision processes

Author :

Ren, Zhinyuan ; Krogh, Bruce H.

Author_Institution :

Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA

Volume :

fYear :

2001

fDate :

2001

Firstpage :

Abstract :

We consider a Markov decision process (MDP) with a two-dimensional state vector, (S, D), where S is interpreted as the system state, and D is interpreted as the probability distribution of the system operating mode, denoted by M. The mode M determines the probability transition and reward structures for S. If the mode were known and constant, a constant-mode optimal controller for controlling the evolution of S could be computed offline. We are interested in knowing if and when the set of constant-mode optimal controllers can be used to control the system effectively when the mode evolves stochastically. We propose a mode-matching control policy under which the controller applied to the system at each epoch is the constant-mode optimal controller for the current most likely mode. We consider the case when the current mode is directly observable (that is, D is the trivial distribution) as well as the case when only the probability distribution of the current mode is available at each control epoch. Sufficient conditions under which the mode-matching control policies are optimal are derived. We also derive bounds on the performance degradation from the optimum when the non-optimal mode-matching control policies are used. The problem formulation, sufficient conditions and performance bounds are illustrated by a numerical example.
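
The mode-matching idea described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' formulation: the abstract does not specify the optimality criterion, so the sketch assumes a discounted infinite-horizon reward and uses standard value iteration to compute each constant-mode controller offline. The array layouts `P` and `R`, the discount factor `gamma`, and the function names `constant_mode_policy` and `mode_matching_action` are all hypothetical.

```python
# Hypothetical model layout (an assumption, not taken from the paper):
#   P[m][a][s][t] = probability of moving from state s to t under action a in mode m
#   R[m][a][s]    = expected one-step reward in state s under action a in mode m

def constant_mode_policy(P_m, R_m, gamma=0.9, tol=1e-9, max_iter=10_000):
    """Value iteration for a single fixed mode; returns a list state -> action.

    Assumes a discounted criterion with factor gamma (an assumption of this sketch).
    """
    n_actions = len(P_m)
    n_states = len(R_m[0])

    def q(V, s, a):
        # One-step lookahead value of taking action a in state s.
        return R_m[a][s] + gamma * sum(P_m[a][s][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [max(q(V, s, a) for a in range(n_actions)) for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function.
    return [max(range(n_actions), key=lambda a: q(V, s, a)) for s in range(n_states)]


def mode_matching_action(policies, s, D):
    """Apply the constant-mode controller of the currently most likely mode.

    D is the mode distribution; ties are broken in favour of the lowest mode index.
    """
    most_likely_mode = max(range(len(D)), key=lambda m: D[m])
    return policies[most_likely_mode][s]


# Toy example: 2 modes, 2 states, 2 actions, uniform transitions everywhere.
uniform = [[0.5, 0.5], [0.5, 0.5]]
P = [[uniform, uniform], [uniform, uniform]]
R = [
    [[1.0, 1.0], [0.0, 0.0]],  # mode 0: action 0 is rewarded
    [[0.0, 0.0], [1.0, 1.0]],  # mode 1: action 1 is rewarded
]

# Offline step: one constant-mode optimal controller per mode.
policies = [constant_mode_policy(P[m], R[m]) for m in range(len(P))]

# Online step: pick the controller that matches the most likely mode.
action_uncertain = mode_matching_action(policies, 0, [0.7, 0.3])
action_observed = mode_matching_action(policies, 0, [0.0, 1.0])  # trivial distribution
```

When the mode is directly observable, `D` places all its mass on one mode and the rule reduces to looking up that mode's controller. Whether the resulting policy is optimal for the full multi-mode problem depends on the sufficient conditions derived in the paper, which this sketch does not check.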

Keywords :

Markov processes; decision theory; optimal control; probability; stochastic systems; constant-mode optimal controller; discrete-time finite state Markov chain; mode-matching control policy; multi-mode Markov decision processes; performance degradation; probability distribution; reward structures; stochastic evolution; system operating mode; two-dimensional state vector; Adaptive control; Control system synthesis; Control systems; Degradation; Distributed computing; Optimal control; Probability distribution; State feedback; System performance; Tin

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Proceedings of the 2001 American Control Conference

Conference_Location :

Arlington, VA

ISSN :

0743-1619

Print_ISBN :

0-7803-6495-3

Type :

conf

DOI :

10.1109/ACC.2001.945521

Filename :

945521

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1751307