• DocumentCode
    2810049
  • Title

    A Heuristic Method to Isolate the State Space

  • Author

    Jin, Zhao ; Liu, Weiyi ; Jin, Jian

  • Author_Institution
    Sch. of Inf. Sci. & Eng., Yunnan Univ., Kunming, China
  • fYear
    2009
  • fDate
    11-13 Dec. 2009
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    A large, complex problem can be solved more easily and quickly by decomposing it into smaller sub-problems. We propose a heuristic method that isolates a large state space into several smaller state spaces in order to decompose the learning task. During learning, after removing the state loops from the learned episodes, we find that some states are critical for the agent to reach the goal state. These critical states have two characteristics: 1) they appear with high probability in all of the acyclic episodes; 2) they are the gates through which the agent moves from one part of the state space into another. We call these critical states gate states. When all gate states are blocked, the original large state space is naturally isolated into several smaller state spaces. Because the isolation is based only on the episodes learned so far, we cannot guarantee that it is complete. Nevertheless, the method gives the agent the ability to decompose its state space according to the knowledge it has learned. Experiments on a grid-world problem show that the isolation tends toward completeness as the number of training episodes increases.
  • Keywords
    learning (artificial intelligence); probability; acyclic episodes; gate states; grid-world problem; heuristic method; reinforcement learning; state space isolation; Face; Humans; Information science; Machine learning; Neural networks; State-space methods
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Software Engineering, 2009. CiSE 2009. International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-4507-3
  • Electronic_ISBN
    978-1-4244-4507-3
  • Type

    conf

  • DOI
    10.1109/CISE.2009.5362940
  • Filename
    5362940
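
The abstract's first characteristic of gate states (high frequency across loop-free episodes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `remove_loops` and `gate_candidates`, the `threshold` parameter, and the choice to cut a loop back to the first visit of the repeated state are all assumptions made for illustration. The second characteristic (that a gate state connects two parts of the state space) is not checked here.

```python
from collections import Counter


def remove_loops(episode):
    """Return the episode as an acyclic path (hypothetical helper).

    When a state is revisited, everything visited since its first
    occurrence is discarded, so each state appears at most once.
    """
    path = []
    position = {}  # state -> index in path
    for state in episode:
        if state in position:
            cut = position[state]
            # Forget the states that formed the loop.
            for dropped in path[cut + 1:]:
                del position[dropped]
            path = path[:cut + 1]
        else:
            position[state] = len(path)
            path.append(state)
    return path


def gate_candidates(episodes, threshold=0.9):
    """Return interior states that occur in at least `threshold` of the
    acyclic episodes. The threshold value is an assumption, not a figure
    taken from the paper.
    """
    acyclic = [remove_loops(e) for e in episodes]
    counts = Counter()
    for path in acyclic:
        # Skip the start and goal states, which trivially appear in every path.
        counts.update(set(path[1:-1]))
    total = len(acyclic)
    return {s for s, c in counts.items() if c / total >= threshold}


# Toy episodes over labeled states; "g" is the only state every path crosses.
episodes = [
    ["a", "b", "a", "c", "g", "d", "goal"],
    ["a", "b", "g", "e", "d", "e", "goal"],
    ["a", "c", "b", "g", "d", "goal"],
]
candidates = gate_candidates(episodes)
```

In this toy data, `candidates` contains only `"g"`. With more episodes the frequency estimates become more reliable, which is in line with the abstract's remark that isolation improves as training episodes increase.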