Title :
Estimating Probability Distribution with Q-learning for Biped Gait Generation and Optimization
Author :
Hu, Lingyun ; Zhou, Changjiu ; Sun, Zengqi
Abstract :
A new biped gait generation and optimization method is proposed within the framework of estimation of distribution algorithms (EDAs) combined with Q-learning. Biped gait synthesis is formulated as a constrained multi-objective optimization problem, and a dynamically stable, low-energy-cost gait is generated by an EDA with Q-learning (EDA_Q). EDA_Q estimates probability distributions derived from the objective function and samples them to generate search points in the highly coupled, high-dimensional working space of biped robots. To obtain a preferable permutation of the interrelated parameters, Q-learning is used to build and modify the probability models in the EDA autonomously. By exploiting the global optimization capability of the EDA, EDA_Q also addresses the local-minima problem of traditional Q-learning. Conversely, with the learning agent, EDA_Q can evaluate the probability distribution model regularly without a pre-designed structure or updating rule. Simulation results show that faster and more accurate searching can be achieved to generate preferable biped gaits. The gait has been successfully used to drive Robo-Erectus, one of the leading soccer-playing humanoid robots in the RoboCup Humanoid League.
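The general idea of pairing an EDA with a Q-learning agent can be illustrated with a minimal sketch. This is not the authors' EDA_Q: the abstract does not specify the probability model, state space, actions, or reward, so everything below is an assumption made for illustration. The sketch assumes a univariate Gaussian model, a hypothetical `toy_cost` (a sphere function standing in for a gait cost), and a single-state Q-table whose actions are candidate update rates for the model.

```python
import random
import statistics


def toy_cost(x):
    # Hypothetical stand-in for a gait cost function (assumption): sphere function.
    return sum(v * v for v in x)


def eda_q(dim=4, pop=40, elite=10, gens=60, seed=0):
    """Toy Gaussian EDA whose model-update rate is picked by single-state Q-learning."""
    rng = random.Random(seed)
    mean = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
    std = [2.0] * dim
    actions = [0.2, 0.5, 1.0]      # candidate update rates (assumed action set)
    q = [0.0] * len(actions)       # Q-values for the single state
    alpha, gamma, eps = 0.3, 0.9, 0.2
    best = float("inf")
    history = []
    prev_action, prev_best = None, None

    for _ in range(gens):
        # Sample search points from the current probability model.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(pop)]
        samples.sort(key=toy_cost)
        gen_best = toy_cost(samples[0])

        # Reward the previous action by how much the generation's best cost improved.
        if prev_action is not None:
            reward = prev_best - gen_best
            q[prev_action] += alpha * (reward + gamma * max(q) - q[prev_action])

        best = min(best, gen_best)
        history.append(best)

        # Epsilon-greedy choice of the update rate for this generation.
        if rng.random() < eps:
            a = rng.randrange(len(actions))
        else:
            a = max(range(len(actions)), key=q.__getitem__)
        rate = actions[a]

        # Move the Gaussian model toward the statistics of the elite samples.
        top = samples[:elite]
        for d in range(dim):
            col = [s[d] for s in top]
            mean[d] = (1 - rate) * mean[d] + rate * statistics.fmean(col)
            std[d] = max(1e-3, (1 - rate) * std[d] + rate * statistics.pstdev(col))

        prev_action, prev_best = a, gen_best

    return best, mean, history
```

Calling `eda_q(seed=1)` returns the best cost found, the final model mean, and the best-so-far cost per generation. A real gait optimizer would replace `toy_cost` with stability and energy objectives under the robot's constraints, which this sketch does not attempt.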
Keywords :
humanoid robots; learning (artificial intelligence); legged locomotion; multi-robot systems; optimisation; statistical distributions; Q-learning method; Robo-Erectus; biped gait generation; biped gait synthesis; constrained multi-objective optimization problem; learning agent; probability distribution estimation; soccer-playing humanoid robot; Computer science; Constraint optimization; Cost function; Electronic design automation and methodology; Humanoid robots; Intelligent robots; Legged locomotion; Optimization methods; Power engineering and energy; Probability distribution; Biped robot; Q-learning; estimation distribution algorithm; gait optimization; probability distribution;
Conference_Title :
Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on
Conference_Location :
Beijing
Print_ISBN :
1-4244-0259-X
Electronic_ISBN :
1-4244-0259-X
DOI :
10.1109/IROS.2006.281728