• DocumentCode
    2641832
  • Title

    A performance gradient perspective on approximate dynamic programming and its application to partially observable Markov decision processes

  • Author

    Dankert, James ; Yang, Lei ; Si, Jennie

  • Author_Institution
    Department of Electrical Engineering, Arizona State University, Tempe, 85287-5706 USA
  • fYear
    2006
  • fDate
    4-6 Oct. 2006
  • Firstpage
    458
  • Lastpage
    463
  • Abstract
    This paper presents an approach to integrating common approximate dynamic programming (ADP) algorithms into a theoretical framework to address both analytical characteristics and algorithmic features. Several important insights are gained from this analysis, including new approaches to the creation of algorithms. Built on this paradigm, ADP learning algorithms are further developed to address a broader class of problems: optimization with partial observability. The framework is based on an average-cost formulation, which uses the concepts of differential costs and performance gradients to describe learning and optimization algorithms. Numerical simulations, including a queueing problem and a maze problem, are conducted to illustrate and verify features of the proposed algorithms. Pathways for applying this analysis to adaptive critics are also shown.
  • Keywords
    Algorithm design and analysis; Cost function; Dynamic programming; Equations; Function approximation; Heuristic algorithms; Intelligent control; Observability; Optimization methods; Performance analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Aided Control System Design, 2006 IEEE International Conference on Control Applications, 2006 IEEE International Symposium on Intelligent Control, 2006 IEEE
  • Conference_Location
    Munich, Germany
  • Print_ISBN
    0-7803-9797-5
  • Electronic_ISBN
    0-7803-9797-5
  • Type

    conf

  • DOI
    10.1109/CACSD-CCA-ISIC.2006.4776689
  • Filename
    4776689