Title :
Reinforcement learning methods for multi-linked manipulator obstacle avoidance and control
Author :
Tham, Chen K. ; Prager, Richard W.
Author_Institution :
Dept. of Eng., Cambridge Univ., UK
Abstract :
This paper treats the multi-linked manipulator obstacle avoidance and control task as the interaction between a learning agent and an unknown environment. The role of the agent is to generate actions that maximise the reward it receives from the environment. We demonstrate how two learning algorithms common in the reinforcement learning literature, the adaptive heuristic critic (AHC) (Barto et al., 1983) and Q-learning (Watkins, 1989), can be used to solve the task successfully in two different ways: 1) through the generation of position commands to a PD controller, which produces torque commands to drive the manipulator, and 2) through the direct generation of torque commands, removing the need for a PD controller. During the process, the inverse kinematics problem for multi-linked manipulators is solved automatically. Fast function approximation is achieved through the use of an array of cerebellar model arithmetic computers (CMAC). The generation of both discrete and continuous actions is investigated, and the performance of the algorithms is evaluated in terms of learning rates, efficiency of solutions, and memory requirements.
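As an illustration of how Q-learning and a CMAC can be combined for discrete actions, the following is a minimal sketch only. It is not the authors' implementation: the abstract does not give the state representation, reward, tiling layout, or learning parameters, so the class `CMAC`, the function `q_learning_step`, the two-joint-angle state, and all numeric settings are assumptions chosen for the example.

```python
import numpy as np


class CMAC:
    """Hypothetical tile-coding approximator: several offset grids over a bounded state space."""

    def __init__(self, low, high, n_tilings=8, bins=10, n_actions=3):
        self.low = np.asarray(low, dtype=float)
        self.high = np.asarray(high, dtype=float)
        self.n_tilings = n_tilings
        self.bins = bins
        self.n_actions = n_actions
        dims = len(self.low)
        # One weight table per tiling; bins + 1 cells per dimension leave room for the offset.
        self.w = np.zeros((n_tilings,) + (bins + 1,) * dims + (n_actions,))

    def _cells(self, state):
        """Return the index of the single active cell in each tiling."""
        span = self.high - self.low
        scaled = (np.asarray(state, dtype=float) - self.low) / span * self.bins
        scaled = np.clip(scaled, 0.0, self.bins)
        cells = []
        for t in range(self.n_tilings):
            offset = t / self.n_tilings  # each tiling is shifted by a fraction of a cell
            idx = np.floor(scaled + offset).astype(int)
            cells.append((t,) + tuple(idx))
        return cells

    def q_values(self, state):
        """Q(s, a) for every action: the sum of the active cell weights across tilings."""
        q = np.zeros(self.n_actions)
        for cell in self._cells(state):
            q += self.w[cell]
        return q

    def update(self, state, action, target, alpha):
        """Move Q(s, action) towards target, spreading the correction over all tilings."""
        error = target - self.q_values(state)[action]
        for cell in self._cells(state):
            self.w[cell + (action,)] += alpha / self.n_tilings * error


def q_learning_step(cmac, s, a, r, s_next, done, alpha=0.1, gamma=0.95):
    """One-step Q-learning backup: target = r + gamma * max_a' Q(s', a')."""
    target = r if done else r + gamma * np.max(cmac.q_values(s_next))
    cmac.update(s, a, target, alpha)
    return target


# Usage: a state made of two joint angles (assumed), three discrete actions (assumed).
cmac = CMAC(low=[-np.pi, -np.pi], high=[np.pi, np.pi])
q_learning_step(cmac, s=[0.1, 0.2], a=1, r=1.0, s_next=[0.1, 0.2], done=True, alpha=1.0)
```

Because neighbouring states activate many of the same cells, a single update also changes the estimate for nearby joint configurations, which is the generalisation that makes CMAC-based approximation fast. Memory grows with the number of tilings and cells per dimension, so a dense table like this one suits only low-dimensional states.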
Keywords :
adaptive systems; function approximation; kinematics; learning systems; neural nets; nonlinear control systems; path planning; CMAC; PD controller; Q-learning; adaptive heuristic critic; cerebellar model arithmetic computers; inverse kinematics; multilinked manipulator; obstacle avoidance; position commands; reinforcement learning; torque commands; Adaptive control; Machine learning; Manipulator dynamics; PD control; Programmable control; Robots; Stochastic processes; Torque control
Conference_Title :
Asia-Pacific Workshop on Advances in Motion Control Proceedings, 1993
Print_ISBN :
0-7803-1223-6
DOI :
10.1109/APWAM.1993.316204