This paper focuses on reinforcement learning (RL) with limited prior knowledge. In swarm robotics, for instance, the expert can hardly design a reward function or demonstrate the target behavior, which rules out both standard RL and inverse reinforcement learning. Even with limited expertise, the human expert is often still able to express preferences and rank the agent's demonstrations. Earlier work presented an iterative preference-based RL framework in which expert preferences are used to learn an approximate policy return, thus enabling the agent to perform direct policy search. At each iteration, the agent selects a new candidate policy and demonstrates it; the expert ranks the new demonstration against the previous best one; and the ranking feedback lets the agent refine the approximate policy return. In this paper, preference-based reinforcement learning is combined with active ranking in order to reduce the number of ranking queries the expert must answer before a satisfactory policy is found. Experiments on the mountain car and cancer treatment testbeds show that a couple of dozen rankings suffice to learn a competent policy.
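The iterative loop described above (select a candidate, demonstrate it, let the expert rank it against the current best, refine the approximate policy return) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the paper's actual method: policies are plain parameter vectors, the expert is simulated by a hidden target vector, the approximate policy return is a linear model updated with a perceptron-style ranking step, and picking the candidate that maximizes the surrogate is only a crude stand-in for an active ranking criterion. The names `expert_prefers`, `surrogate`, and `preference_based_search` are hypothetical.

```python
import random


def expert_prefers(a, b, target):
    """Simulated expert (assumption): prefers the policy closer to a hidden target."""
    def dist(p):
        return sum((x - t) ** 2 for x, t in zip(p, target))
    return dist(a) < dist(b)


def surrogate(w, theta):
    """Approximate policy return: a linear model over policy parameters (assumption)."""
    return sum(wi * xi for wi, xi in zip(w, theta))


def preference_based_search(dim=3, n_queries=24, n_candidates=20,
                            step=0.5, lr=0.1, seed=0):
    rng = random.Random(seed)
    target = [1.0] * dim   # hidden from the agent; used only by the simulated expert
    best = [0.0] * dim     # current best policy
    w = [0.0] * dim        # weights of the approximate policy return
    history = [best]

    for _ in range(n_queries):
        # 1. Select a new candidate: sample around the current best and keep
        #    the sample that the surrogate rates highest.
        candidates = [[b + rng.gauss(0.0, step) for b in best]
                      for _ in range(n_candidates)]
        new = max(candidates, key=lambda c: surrogate(w, c))

        # 2. One ranking query: the expert compares the new demonstration
        #    with the previous best one.
        if expert_prefers(new, best, target):
            winner, loser = new, best
        else:
            winner, loser = best, new

        # 3. Refine the surrogate with a perceptron-style ranking update
        #    whenever the preferred policy does not win by a margin of 1.
        if surrogate(w, winner) - surrogate(w, loser) < 1.0:
            w = [wi + lr * (a - b) for wi, a, b in zip(w, winner, loser)]

        best = winner
        history.append(best)

    return best, w, history, target


if __name__ == "__main__":
    best, w, history, target = preference_based_search()
    print("best policy after", len(history) - 1, "ranking queries:", best)
```

The default budget of 24 queries mirrors the "couple of dozen rankings" mentioned in the abstract, but this toy problem says nothing about the mountain car or cancer treatment results. Because the current best is replaced only when the simulated expert prefers the new candidate, the best policy never gets worse from the expert's point of view.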