Title :
Adaptive control of i.i.d. processes and Markov chains on a compact control set
Author_Institution :
Dept. of Electr. & Comput. Eng., Wisconsin Univ., Madison, WI, USA
Abstract :
The author considers the multiarmed bandit problem and the adaptive control of Markov chains with continuous arms that are chosen from a compact subset of R^d. A learning scheme based on a kernel estimator is devised. Using this learning scheme, the author constructs a class of certainty equivalence control with forcing schemes and derives asymptotic upper bounds on their learning loss.
Keywords :
Markov processes; adaptive control; estimation theory; game theory; learning systems; Markov chains; asymptotic upper bounds; certainty equivalence control; compact control set; forcing schemes; i.i.d. processes; kernel estimator; learning loss; learning scheme; multiarmed bandit problem; Arm; Context modeling; Control systems; Kernel; Process control; Stochastic processes; Stochastic systems; Upper bound
Conference_Title :
Decision and Control, 1992., Proceedings of the 31st IEEE Conference on
Conference_Location :
Tucson, AZ
Print_ISBN :
0-7803-0872-7
DOI :
10.1109/CDC.1992.371317