We use reinforcement learning (RL) to compute strategies for multiagent soccer teams. RL may profit significantly from world models (WMs) estimating state transition probabilities and rewards. In high-dimensional, continuous input spaces, however, learning accurate WMs is intractable. Here we show that incomplete WMs can help to quickly find good action selection policies. Our approach is based on a novel combination of CMACs and prioritized sweeping-like algorithms. Variants thereof outperform both Q(λ)-learning with CMACs and the evolutionary method Probabilistic Incremental Program Evolution (PIPE), which performed best in previous comparisons.
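As a rough illustration of the CMAC component mentioned above, the following is a minimal sketch of a CMAC (tile-coding) function approximator over a bounded continuous input. It is not the configuration used in the paper: the class name `CMAC`, the parameters `n_tilings`, `n_bins`, and `lr`, and the evenly spaced tiling offsets are all assumptions made for illustration. Each input activates one tile per tiling, the prediction is the sum of the active tiles' weights, and an update spreads the error equally across those tiles.

```python
import math


class CMAC:
    """Illustrative CMAC (tile coding) approximator; all names are hypothetical."""

    def __init__(self, lows, highs, n_tilings=4, n_bins=8, lr=0.1):
        self.lows, self.highs = lows, highs
        self.n_tilings, self.n_bins = n_tilings, n_bins
        self.lr = lr
        # One sparse weight table per tiling, keyed by tile index tuple.
        self.weights = [dict() for _ in range(n_tilings)]

    def _tiles(self, x):
        """Yield (tiling, tile index) for the one active tile in each tiling."""
        for t in range(self.n_tilings):
            # Assumption: tilings are shifted by equal fractions of a bin width.
            offset = t / self.n_tilings
            idx = []
            for xi, lo, hi in zip(x, self.lows, self.highs):
                scaled = (xi - lo) / (hi - lo) * self.n_bins + offset
                idx.append(min(max(int(math.floor(scaled)), 0), self.n_bins))
            yield t, tuple(idx)

    def predict(self, x):
        """Sum the weights of the active tiles."""
        return sum(self.weights[t].get(k, 0.0) for t, k in self._tiles(x))

    def update(self, x, target):
        """Move the prediction at x toward target, sharing the error across tilings."""
        error = target - self.predict(x)
        step = self.lr * error / self.n_tilings
        for t, k in self._tiles(x):
            self.weights[t][k] = self.weights[t].get(k, 0.0) + step


cmac = CMAC(lows=[0.0, 0.0], highs=[1.0, 1.0])
for _ in range(200):
    cmac.update([0.3, 0.7], 2.0)
```

Because nearby inputs share tiles, training at one point also changes predictions in its neighbourhood, while distant inputs are unaffected. In an RL setting such an approximator could hold value estimates or, as the abstract suggests, parts of a world model; how the paper combines it with prioritized sweeping is not shown here.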