• DocumentCode
    188650
  • Title
    nso-HSVI: A Not-So-Optimistic Heuristic Search Value Iteration Algorithm for POMDPs
  • Author
    Feng Liu ; Haibo Li ; Chongjun Wang
  • Author_Institution
    Nat. Key Lab. for Novel Software Technol. Software Inst., Nanjing Univ., Nanjing, China
  • fYear
    2014
  • fDate
    10-12 Nov. 2014
  • Firstpage
    689
  • Lastpage
    693
  • Abstract
    Point-based value iteration methods improve computational efficiency by reducing the size of the search space. Although algorithms such as HSVI and GapMin can reach a globally optimal solution, their choice of which action to explore is overly optimistic, which reduces their efficiency. In this paper, we propose a novel heuristic search method, nso-HSVI (not-so-optimistic Heuristic Search Value Iteration), which uses a Monte-Carlo method to estimate the probability that each action is optimal according to the distribution of the actions' Q-value functions, and applies the action with the maximum probability. Experimental results show that nso-HSVI outperforms HSVI, and by a large margin as the scale of the POMDP increases.
  • Keywords
    Markov processes; Monte Carlo methods; iterative methods; optimisation; probability; GapMin; Monte-Carlo method; POMDP; Q-value function; heuristic search method; not-so-optimistic heuristic search value iteration; nso-HSVI; optimization; partially observable Markov decision processes; point-based value iteration methods; search space size; Algorithm design and analysis; Approximation algorithms; Approximation methods; Convergence; Probability density function; Upper bound; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on
  • Conference_Location
    Limassol
  • ISSN
    1082-3409
  • Type
    conf
  • DOI
    10.1109/ICTAI.2014.108
  • Filename
    6984544
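
The action-selection idea summarized in the Abstract field can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the Q-values are distributed, so the sketch assumes that each action's Q-value at the current belief is uniformly distributed between a lower and an upper bound. The function names (`estimate_optimal_action_probs`, `select_action`) and the example bounds are hypothetical.

```python
import random


def estimate_optimal_action_probs(bounds, n_samples=10000, seed=0):
    """Monte Carlo estimate of P(action is optimal) for each action.

    bounds: dict mapping action -> (lower, upper) bound on its Q-value.
    ASSUMPTION: each Q-value is uniform on [lower, upper]; the abstract
    does not state the distribution the paper actually uses.
    """
    rng = random.Random(seed)
    actions = list(bounds)
    wins = {a: 0 for a in actions}
    for _ in range(n_samples):
        # Draw one plausible Q-value per action.
        draws = {a: rng.uniform(lo, hi) for a, (lo, hi) in bounds.items()}
        # The action with the largest sampled Q-value wins this round.
        best = max(actions, key=draws.__getitem__)
        wins[best] += 1
    return {a: wins[a] / n_samples for a in actions}


def select_action(bounds, n_samples=10000, seed=0):
    """Pick the action with the highest estimated probability of being optimal."""
    probs = estimate_optimal_action_probs(bounds, n_samples, seed)
    return max(probs, key=probs.get)


def optimistic_action(bounds):
    """Purely optimistic rule for comparison: highest upper bound wins."""
    return max(bounds, key=lambda a: bounds[a][1])


if __name__ == "__main__":
    # Hypothetical bounds: a1 has a wide interval, a2 a narrow, high one.
    example = {"a1": (0.0, 10.0), "a2": (6.0, 8.0)}
    print("optimistic choice:", optimistic_action(example))
    print("estimated probabilities:", estimate_optimal_action_probs(example))
    print("max-probability choice:", select_action(example))
```

With these hypothetical bounds, a purely optimistic rule picks `a1` because it has the highest upper bound. Under the uniform assumption, `a1` is optimal with probability 0.3, so the maximum-probability rule picks `a2` instead.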