The behavior of reinforcement learning (RL) algorithms is best understood in completely observable, discrete-time controlled Markov chains with finite state and action spaces. In contrast, robot-learning domains are inherently continuous in both time and space, and moreover are partially observable. Here we suggest a systematic approach to solving such problems in which the available qualitative and quantitative knowledge is used to reduce the complexity of the learning task. The steps of the design process are to: (i) decompose the task into subtasks using the qualitative knowledge at hand; (ii) design local controllers to solve the subtasks using the available quantitative knowledge; and (iii) learn a coordination of these controllers by means of reinforcement learning. It is argued that the approach enables fast, semi-automatic, yet still high-quality robot control, as no fine-tuning of the local controllers is needed. The approach was verified on a non-trivial real-life robot task. Several RL algorithms were compared by ANOVA, and it was found that the model-based approach worked significantly better than the model-free approach. The learnt switching strategy performed comparably to a handcrafted version. Moreover, the learnt strategy seemed to exploit certain properties of the environment that were not foreseen in advance, supporting the view that adaptive algorithms are advantageous over nonadaptive ones in complex environments.
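Step (iii) above, learning a coordination of pre-designed local controllers, can be sketched as tabular Q-learning over (state, controller) pairs. The toy setup below is entirely hypothetical and not from the paper: three stand-in "controllers" act on a one-dimensional track, and the learner discovers which controller to activate in each discrete state.

```python
import random

# Hypothetical stand-ins for hand-designed local controllers on a 1-D track:
# the agent starts at position 0 and must reach position GOAL = 4.
CONTROLLERS = {
    "advance": lambda s: min(s + 1, 4),   # move toward the goal
    "retreat": lambda s: max(s - 1, 0),   # move away from the goal
    "hold":    lambda s: s,               # stay in place
}
GOAL = 4

def q_learn_switching(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Learn a switching strategy: tabular Q-learning over
    (discrete state, controller) pairs with epsilon-greedy exploration."""
    rng = random.Random(seed)
    names = list(CONTROLLERS)
    q = {(s, c): 0.0 for s in range(GOAL + 1) for c in names}
    for _ in range(episodes):
        s = 0
        for _ in range(20):                      # cap episode length
            if rng.random() < eps:               # explore
                c = rng.choice(names)
            else:                                # exploit current estimates
                c = max(names, key=lambda n: q[(s, n)])
            s2 = CONTROLLERS[c](s)
            r = 1.0 if s2 == GOAL else -0.1      # reward reaching the goal
            best_next = max(q[(s2, n)] for n in names)
            q[(s, c)] += alpha * (r + gamma * best_next - q[(s, c)])
            s = s2
            if s == GOAL:                        # terminal state
                break
    # Greedy switching strategy: best controller for each discrete state
    return {s: max(names, key=lambda n: q[(s, n)]) for s in range(GOAL + 1)}
```

In this sketch the local controllers themselves stay fixed, mirroring the paper's point that only the coordination is learned; replacing the lambdas with real feedback controllers and the discrete positions with a qualitative state abstraction gives the full scheme.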