• DocumentCode
    2551316
  • Title

    Active exploratory q-learning for large problems

  • Author

    Wu, Xianghai ; Kofman, Jonathan ; Tizhoosh, Hamid R.

  • Author_Institution
    Department of Systems Design Engineering, University of Waterloo, ON, N2L 3G1 Canada
  • fYear
    2007
  • fDate
    7-10 Oct. 2007
  • Firstpage
    4040
  • Lastpage
    4045
  • Abstract
    Although reinforcement learning (RL) emerged more than a decade ago, it is still under extensive investigation in application to large problems, where the states and actions are multi-dimensional and continuous and result in the so-called curse of dimensionality. Conventional RL methods are still not efficient enough in huge state-action spaces, while value-function generalization-based approaches require a very large number of good training examples. This paper presents an active exploratory approach to address the challenge of RL in large problems. The core principle of this approach is that the agent does not rush to the next state. Instead, it attempts a number of actions at the current state first, and then selects the action that returns the greatest immediate reward. The state resulting from performing the action is considered as the next state. Four active exploration algorithms for good actions are proposed: random-based search, opposition-based random search, search by cyclical adjustment, and opposition-based cyclical adjustment of each action dimension. The efficiency of these algorithms is determined by a visual-servoing experiment with a 6-axis robot.
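
    The core principle in the abstract (try several actions at the current state, keep the one with the greatest immediate reward) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the box-bounded action space, the `immediate_reward` callback, and the toy reward are all assumptions made for illustration. The opposite action is assumed to be the usual opposition-based definition, `low + high - a` per dimension. Only the two random-search variants are sketched; the cyclical-adjustment variants are not.

```python
import random


def opposite_action(action, low, high):
    """Assumed opposition rule: mirror each component within its bounds."""
    return [lo + hi - a for a, lo, hi in zip(action, low, high)]


def explore_actions(state, immediate_reward, low, high, n_trials=10,
                    use_opposition=False, rng=None):
    """Try candidate actions at `state`; return the best (action, reward).

    `immediate_reward(state, action)` is a hypothetical callback standing in
    for executing the action and observing the reward.
    """
    rng = rng or random.Random()
    best_action, best_reward = None, float("-inf")
    for _ in range(n_trials):
        # Random-based search: sample one action uniformly inside the bounds.
        candidate = [rng.uniform(lo, hi) for lo, hi in zip(low, high)]
        candidates = [candidate]
        if use_opposition:
            # Opposition-based random search: also evaluate the mirrored action.
            candidates.append(opposite_action(candidate, low, high))
        for action in candidates:
            reward = immediate_reward(state, action)
            if reward > best_reward:
                best_action, best_reward = action, reward
    return best_action, best_reward


def toy_reward(state, action):
    """Toy stand-in reward: negative squared distance to a made-up target."""
    target = [1.0, -0.5]
    return -sum((s + a - t) ** 2 for s, a, t in zip(state, action, target))


if __name__ == "__main__":
    low, high = [-1.0, -1.0], [1.0, 1.0]
    action, reward = explore_actions([0.0, 0.0], toy_reward, low, high,
                                     n_trials=20, use_opposition=True,
                                     rng=random.Random(0))
    print(action, reward)
```

    In this sketch the selected action would then be executed and the resulting state treated as the next state of the Q-learning update. Opposition doubles the number of reward evaluations per trial, so any comparison between the two variants should count evaluations rather than trials.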
  • Keywords
    Accelerated aging; Convergence; Humans; Learning; Multilayer perceptrons; Neural networks; Neurons; Radio access networks; Resource management; State-space methods
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    978-1-4244-0990-7
  • Type

    conf

  • DOI
    10.1109/ICSMC.2007.4414257
  • Filename
    4414257