Our work focuses on making an autonomous robot manipulator learn suitable collision-free motions from local sensory data while executing high-level descriptions of tasks. The robot arm must reach a sequence of targets where it undertakes some manipulation. The robot manipulator has a sonar sensing skin covering its links to perceive the obstacles in its surroundings. We use reinforcement learning for that purpose, and the neural controller acquires appropriate reaction strategies in acceptable time provided it has some a priori knowledge. This knowledge is specified in two main ways: an appropriate codification of the signals of the neural controller (inputs, outputs and reinforcement) and decomposition of the learning task. The codification facilitates the generalization capabilities of the network as it takes advantage of inherent symmetries and is quite goal-independent. On the other hand, the task of reaching a certain goal position is decomposed into two sequential subtasks: negotiate obstacles and move to goal. Experimental results show that the controller achieves a good performance incrementally in a reasonable time and exhibits high tolerance to failing sensors.
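The decomposition into "negotiate obstacles" and "move to goal" can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not say how the controller switches between subtasks, so the distance threshold `safe_distance`, the function names, and both placeholder policies are hypothetical. In particular, `negotiate_obstacles` is only a stub standing in for the learned neural controller.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Policy = Callable[[Sequence[float], Sequence[float]], List[float]]


def select_subtask(sonar_ranges: Sequence[float], safe_distance: float = 0.3) -> str:
    """Pick the active subtask from sonar range readings.

    Assumed rule (not from the abstract): if any reading is closer than
    safe_distance, obstacles must be negotiated first.
    """
    nearest = min(sonar_ranges, default=float("inf"))
    return "negotiate_obstacles" if nearest < safe_distance else "move_to_goal"


def move_to_goal(sonar_ranges: Sequence[float], goal_offset: Sequence[float]) -> List[float]:
    """Placeholder policy: take a proportional step toward the goal."""
    gain = 0.5  # hypothetical step gain
    return [gain * g for g in goal_offset]


def negotiate_obstacles(sonar_ranges: Sequence[float], goal_offset: Sequence[float]) -> List[float]:
    """Stub for the learned reaction strategy; here it simply holds still."""
    return [0.0 for _ in goal_offset]


POLICIES: Dict[str, Policy] = {
    "move_to_goal": move_to_goal,
    "negotiate_obstacles": negotiate_obstacles,
}


def step(
    sonar_ranges: Sequence[float],
    goal_offset: Sequence[float],
    safe_distance: float = 0.3,
) -> Tuple[str, List[float]]:
    """Select the subtask for this control step and run its policy."""
    subtask = select_subtask(sonar_ranges, safe_distance)
    return subtask, POLICIES[subtask](sonar_ranges, goal_offset)
```

Calling `step([1.0, 0.2, 0.9], [1.0, -2.0])` selects the obstacle subtask because one reading falls below the assumed threshold, while `step([1.0, 0.8], [1.0, -2.0])` selects the goal subtask. Keeping the two policies behind one dispatcher mirrors the idea that each subtask can be specified or learned separately.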