Title : 
Adaptive control of i.i.d. processes and Markov chains on a compact control set
         
        
        
            Author_Institution : 
Dept. of Electr. & Comput. Eng., Wisconsin Univ., Madison, WI, USA
         
        
        
        
        
            Abstract : 
The author considers the multiarmed bandit problem and the adaptive control of Markov chains with continuous arms that are chosen from a compact subset of ℝ^d. A learning scheme based on a kernel estimator is devised. Using this learning scheme, the author constructs a class of certainty equivalence control with forcing schemes and derives asymptotic upper bounds on their learning loss.
         
        
            Keywords : 
Markov processes; adaptive control; estimation theory; game theory; learning systems; Markov chains; asymptotic upper bounds; certainty equivalence control; compact control set; forcing schemes; i.i.d. processes; kernel estimator; learning loss; learning scheme; multiarmed bandit problem; Arm; Context modeling; Control systems; Kernel; Process control; Stochastic processes; Stochastic systems; Upper bound
         
        
        
        
            Conference_Title : 
Proceedings of the 31st IEEE Conference on Decision and Control, 1992
         
        
            Conference_Location : 
Tucson, AZ
         
        
            Print_ISBN : 
0-7803-0872-7
         
        
        
            DOI : 
10.1109/CDC.1992.371317