Title : 
Adaptive control of Markov chains under the weak accessibility
         
        
        
            Author_Institution : 
Dept. of Electr. & Comput. Eng., Wisconsin Univ., Madison, WI, USA
         
        
        
        
        
            Abstract : 
The author considers the adaptive control of Markov chains under the weak accessibility condition, with the objective of minimizing the learning loss. First, it is shown that, under a stationary randomized control scheme, the maximum likelihood estimate of the unknown parameter converges exponentially fast to its true value. Then a certainty equivalence control with a forcing-type scheme is constructed, with alternating phases of forcing and certainty equivalence control. The stationary randomized control scheme used for forcing is applied in such a way that cutting and pasting the resulting observations yields a single Markov chain. This in turn allows the rate of forcing to be chosen appropriately, giving a learning loss of O(f(n) log n) for any function f(n) → ∞ as n → ∞.
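
The alternation between forcing and certainty equivalence described in the abstract can be illustrated with a minimal sketch. This is not the author's construction: it assumes a finite set of candidate models, uses uniform random actions as the stationary randomized control, and borrows the expression f(n) log n from the stated loss bound as an illustrative forcing budget. The names `run`, `is_forcing_step`, `MODEL_A`, `MODEL_B`, and `CE_POLICY` are hypothetical, and the cut-and-paste argument is simplified to counting transitions observed during forced steps only.

```python
import math
import random

# Hypothetical candidate models: model[action][state] is a row of
# next-state probabilities. Only action 0 distinguishes the two models.
MODEL_A = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
MODEL_B = [[[0.1, 0.9], [0.8, 0.2]], [[0.5, 0.5], [0.5, 0.5]]]
# Hypothetical certainty equivalence policies: CE_POLICY[estimate][state].
CE_POLICY = [[0, 0], [1, 1]]


def log_likelihood(counts, model):
    """Log-likelihood of (state, action, next_state) counts under one model."""
    total = 0.0
    for (x, u, y), c in counts.items():
        p = model[u][x][y]
        if p == 0.0:
            return float("-inf")
        total += c * math.log(p)
    return total


def mle(counts, models):
    """Index of the candidate model with the largest log-likelihood."""
    return max(range(len(models)), key=lambda i: log_likelihood(counts, models[i]))


def is_forcing_step(n, forced_so_far, f):
    """Assumed schedule: force while fewer than f(n) * log(n + 1) steps were forced."""
    return forced_so_far < f(n) * math.log(n + 1)


def run(models, true_index, ce_policy, f, horizon, seed=0):
    """Simulate the chain; return the final estimate and the number of forced steps."""
    rng = random.Random(seed)
    truth = models[true_index]
    n_actions, n_states = len(truth), len(truth[0])
    counts = {}
    x, forced, estimate = 0, 0, 0
    for n in range(1, horizon + 1):
        forcing = is_forcing_step(n, forced, f)
        if forcing:
            u = rng.randrange(n_actions)  # stationary randomized control
        else:
            u = ce_policy[estimate][x]  # act as if the estimate were true
        y = rng.choices(range(n_states), weights=truth[u][x])[0]
        if forcing:
            # Only forced transitions feed the estimate in this sketch.
            counts[(x, u, y)] = counts.get((x, u, y), 0) + 1
            forced += 1
            estimate = mle(counts, models)
        x = y
    return estimate, forced
```

For example, `run([MODEL_A, MODEL_B], 1, CE_POLICY, math.sqrt, 2000)` simulates 2000 steps with MODEL_B as the true model. A slower-growing `f` forces less often, which matches the trade-off the abstract points to between the rate of forcing and the learning loss.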
         
        
            Keywords : 
Markov processes; adaptive control; probability; stochastic systems; Markov chains; certainty equivalence control; convergence; forcing; learning loss minimization; maximum likelihood estimate; stationary randomized control scheme; weak accessibility; Arm; Costs; Optimal control; Parameter estimation; State-space methods; Stochastic processes; Uncertainty
         
        
        
        
            Conference_Title : 
Decision and Control, 1990., Proceedings of the 29th IEEE Conference on
         
        
            Conference_Location : 
Honolulu, HI
         
        
        
            DOI : 
10.1109/CDC.1990.203847