DocumentCode
188650
Title
nso-HSVI: A Not-So-Optimistic Heuristic Search Value Iteration Algorithm for POMDPs
Author
Feng Liu ; Haibo Li ; Chongjun Wang
Author_Institution
Nat. Key Lab. for Novel Software Technol., Software Inst., Nanjing Univ., Nanjing, China
fYear
2014
fDate
10-12 Nov. 2014
Firstpage
689
Lastpage
693
Abstract
Point-based value iteration methods improve computational efficiency by reducing the size of the search space. Although algorithms such as HSVI and GapMin can reach a globally optimal solution, their choice of which action to explore is overly optimistic, which reduces their efficiency. In this paper, we propose a novel heuristic search method, nso-HSVI (not-so-optimistic Heuristic Search Value Iteration), which uses a Monte-Carlo method to estimate the probability that each action is optimal, based on the distribution of the actions' Q-value function, and applies the action with the maximum probability. Experimental results show that nso-HSVI outperforms HSVI, and by a large margin as the scale of the POMDP increases.
Keywords
Markov processes; Monte Carlo methods; iterative methods; optimisation; probability; GapMin; Monte-Carlo method; POMDP; Q-value function; heuristic search method; not-so-optimistic heuristic search value iteration; nso-HSVI; optimization; partially observable Markov decision processes; point-based value iteration methods; search space size; Algorithm design and analysis; Approximation algorithms; Approximation methods; Convergence; Probability density function; Upper bound; Vectors
fLanguage
English
Publisher
IEEE
Conference_Titel
Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on
Conference_Location
Limassol
ISSN
1082-3409
Type
conf
DOI
10.1109/ICTAI.2014.108
Filename
6984544