  • DocumentCode
    2765653
  • Title
    Dynamic Exploration in Q(λ)-learning
  • Author
    van Ast, J.; Babuska, Robert
  • Author_Institution
    Delft Center for Systems and Control, Delft University of Technology, Mekelweg 2, 2628 CD Delft, the Netherlands
  • fYear
    2006
  • fDate
    16-21 July 2006
  • Firstpage
    41
  • Lastpage
    46
  • Abstract
    Reinforcement learning has proved its value in solving complex optimization tasks. However, the learning time for even simple problems is typically very long. Efficient exploration of the state-action space is therefore crucial for effective learning. This paper introduces a new type of exploration, called dynamic exploration. It differs from existing exploration methods (both directed and undirected) in that it makes exploration a function of the action selected in the previous time step. In our approach, each state is either a long-path state, where the optimal action is the same as in the previous state, or a switch state, where the optimal action differs. In realistic learning problems, the number of long-path states exceeds the number of switch states. Given this information, the exploration method can explore the state-action space more efficiently. Experiments on different gridworld optimization tasks demonstrate a reduction in learning time with dynamic exploration.
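    A minimal sketch of the idea described in the abstract, not the authors' implementation: when the agent decides to explore, it is biased toward repeating the previous action (reflecting the assumption that long-path states outnumber switch states), otherwise it acts ε-greedily. The toy gridworld, the parameter names (epsilon, repeat_prob), and the simplified Q(λ)-style trace update below are illustrative assumptions.

    ```python
    # Illustrative sketch only: dynamic exploration biased toward repeating the
    # previous action, combined with a simplified Q(lambda)-style update.
    # The gridworld and all parameter values are assumptions, not the paper's setup.
    import random
    from collections import defaultdict

    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # right, left, down, up
    SIZE, GOAL = 5, (4, 4)

    def step(state, a):
        """Deterministic gridworld transition with -1 step cost, 0 at the goal."""
        nxt = (min(max(state[0] + a[0], 0), SIZE - 1),
               min(max(state[1] + a[1], 0), SIZE - 1))
        return nxt, (0.0 if nxt == GOAL else -1.0), nxt == GOAL

    def select_action(Q, state, prev_a, epsilon=0.1, repeat_prob=0.7):
        """Dynamic exploration: exploratory moves prefer the previous action,
        since long-path states are assumed to outnumber switch states."""
        if random.random() < epsilon:
            if prev_a is not None and random.random() < repeat_prob:
                return prev_a                      # treat as a long-path state
            return random.choice(ACTIONS)          # occasional random switch
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def run_episode(Q, E, alpha=0.1, gamma=0.95, lam=0.9, max_steps=200):
        state, prev_a, done, t = (0, 0), None, False, 0
        while not done and t < max_steps:
            a = select_action(Q, state, prev_a)
            nxt, r, done = step(state, a)
            best_next = max(Q[(nxt, b)] for b in ACTIONS)
            delta = r + gamma * best_next - Q[(state, a)]
            E[(state, a)] += 1.0                    # accumulating eligibility trace
            for key in list(E):
                Q[key] += alpha * delta * E[key]
                E[key] *= gamma * lam               # decay all traces
            state, prev_a, t = nxt, a, t + 1

    if __name__ == "__main__":
        Q, E = defaultdict(float), defaultdict(float)
        for _ in range(200):
            E.clear()
            run_episode(Q, E)
    ```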
  • Keywords
    Control systems; Convergence; Costs; Dynamic programming; Frequency; Learning; Process control; Search problems; Space technology; Switches;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2006 International Joint Conference on Neural Networks (IJCNN '06)
  • Conference_Location
    Vancouver, BC
  • Print_ISBN
    0-7803-9490-9
  • Type
    conf
  • DOI
    10.1109/IJCNN.2006.246657
  • Filename
    1716068