مرکز منطقه ای اطلاع رساني علوم و فناوري - Adversarial Multi-Armed Bandit Approach to Stochastic Optimization

DocumentCode :

2467929

Title :

Adversarial Multi-Armed Bandit Approach to Stochastic Optimization

Author :

Chang, Hyeong Soo ; Fu, Michael C. ; Marcus, Steven I.

Author_Institution :

Dept. of Comput. Sci. & Eng., Sogang Univ., Seoul

fYear :

2006

fDate :

13-15 Dec. 2006

Firstpage :

5681

Lastpage :

5686

Abstract :

We first present a sampling-based algorithm for solving stochastic optimization problems based on the Exp3 algorithm by Auer et al for "adversarial multi-armed bandit problems." The value returned by the algorithm is an upper-bound estimate of the sample-average-maximum for a sample average approximation (SAA) problem induced by the sampling process of the algorithm, and the estimate converges to the optimal objective-function value as the number of samples goes to infinity. We then recursively extend the Exp3-based algorithm for solving a given finite-horizon Markov decision process (MDP) and analyze its finite-time performance in terms of the expected bias relative to the maximum value of the induced recursive SAA problem, showing that the upper bound of the expected bias approaches zero as the sampling size per stage goes to infinity, leading to the convergence to the optimal value of the original MDP problem in the limit

Keywords :

Markov processes; decision theory; optimisation; sampling methods; Markov decision process; multiarmed bandit approach; sample average approximation; sampling-based algorithm; stochastic optimization; Algorithm design and analysis; Approximation algorithms; Computer science; Educational institutions; H infinity control; Probability distribution; Pursuit algorithms; Sampling methods; Stochastic processes; USA Councils;

fLanguage :

English

Publisher :

ieee

Conference_Titel :

Decision and Control, 2006 45th IEEE Conference on

Conference_Location :

San Diego, CA

Print_ISBN :

1-4244-0171-2

Type :

conf

DOI :

10.1109/CDC.2006.377724

Filename :

4177233

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2467929