Title :
Task decomposition and dynamic policy merging in the distributed Q-learning classifier system
Author :
Chapman, Kevin L. ; Bay, John S.
Author_Institution :
Bradley Dept. of Electr. Eng., Virginia Polytech. Inst. & State Univ., Blacksburg, VA, USA
Abstract :
A distributed reinforcement learning system is designed and implemented on a mobile robot to study complex task decomposition and dynamic policy merging in real robot learning environments. The distributed Q-learning classifier system (DBLCS) evolved from the standard learning classifier system (LCS) proposed by Holland (1996). We address two limitations of the LCS by using Q-learning as the apportionment-of-credit component and a distributed learning architecture to facilitate complex task decomposition. The Q-learning update equation is derived, and its advantages over the complex bucket brigade algorithm (BBA) are discussed. Holistic and monolithic shaping approaches are used to distribute reward among the learning modules of the DBLCS and to allow dynamic policy merging in a variety of real robot learning experiments.
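The update equation the abstract refers to is, in its standard tabular form, the one-step Q-learning rule. A minimal sketch is given below; the state and action names, learning rate, and discount factor are illustrative assumptions, not values from the paper:

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    Q is a dict-of-dicts mapping state -> {action: value}.
    Unlike the bucket brigade, credit propagates through the bootstrapped
    max over next-state action values rather than chained classifier bids.
    """
    best_next = max(Q[s_next].values())  # greedy value of the successor state
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q[s][a]


# Hypothetical two-state example: moving 'right' from state 0 reaches
# state 1, whose best action is already valued at 1.0.
Q = {0: {"left": 0.0, "right": 0.0},
     1: {"left": 1.0, "right": 0.0}}
q_update(Q, 0, "right", 0.5, 1)  # -> 0.1 * (0.5 + 0.9 * 1.0 - 0.0) = 0.14
```

In a distributed architecture like the one described, each learning module would maintain its own table of this form, with the shaping scheme deciding how the scalar reward `r` is apportioned among modules.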
Keywords :
learning (artificial intelligence); minimisation; mobile robots; path planning; apportionment of credit; complex bucket brigade algorithm; distributed Q-learning classifier system; distributed reinforcement learning system; dynamic policy merging; holistic shaping; mobile robot; monolithic shaping; task decomposition; Artificial intelligence; Equations; Intelligent robots; Learning systems; Machine learning; Merging; Mobile robots; Robot sensing systems; State-space methods; Unsupervised learning;
Conference_Titel :
Proceedings of the 1997 IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA '97)
Conference_Location :
Monterey, CA
Print_ISBN :
0-8186-8138-1
DOI :
10.1109/CIRA.1997.613854